Abstract

The research presented in this thesis concerns the problem of dark energy in cosmology, the as yet unidentified form of energy that currently dominates the dynamics of our universe. We consider the possibility that a non-local modification of General Relativity is at the origin of this effect. Inspired by the theory of massive gravity, we construct non-local theories in which gravity can have a mass but diffeomorphism invariance is not broken. We focus on the cosmology of these theories and confront them with some observational constraints. On a more theoretical level, we also dwell on the subtleties of non-local field theory, clarifying some misunderstandings regarding the question of stability.



Summary


In this thesis, our first motivation is to construct a theory of massive gravity that is invariant under coordinate transformations and does not involve an external reference metric, which is possible if one resorts to non-local terms. However, phenomenological constraints will lead us to non-local modifications of General Relativity in which gravity is not necessarily massive, but whose cosmology reproduces current observations.

The dynamical structure of a non-local field theory presents some subtleties compared to local theories, and not taking them into account can lead to erroneous conclusions. We therefore begin by studying the dynamics of linear, local massive gauge theories from several different angles, in order to highlight the properties that will not carry over to the non-local case. We take this opportunity to discuss an interesting aspect of the linear theory of a massive spin-2 field, namely a hidden gauge symmetry in the scalar sector. It appears only when the non-dynamical fields are eliminated through their equations of motion and, in this sense, it corresponds to a symmetry of the physics but not to a symmetry of the action.

We conclude the study of local linear massive theories by reformulating them as non-local but gauge-invariant massive theories, through the Stückelberg formalism. This constitutes our first step into non-local theories, even though in this case the non-locality is only apparent and disappears with an appropriate gauge choice. However, the technology thus developed allows us to define a genuinely non-local and gauge-invariant theory of a linear massive spin-2 field.

Following this, we pause to discuss in depth the aforementioned subtleties of non-local theories. The first is that causal non-local equations cannot be obtained through the standard variational principle applied to a non-local action, but that there nevertheless exists a more general variational principle that does the job. Then, through a localization procedure, which consists in rewriting the equations in local form by introducing auxiliary fields, we see that the dynamical content of these theories is larger than it seems. These fields obey dynamical equations, but their initial conditions are constrained by the choice of definition of our non-local operators in the original theory. This last fact implies that we cannot consistently quantize non-local theories and therefore that the latter can only be interpreted as effective classical theories.

The original content of this part consists in clarifying some confusion in the literature regarding the impact of these auxiliary fields on the stability of the solutions of interest. Indeed, it turns out that in most of the non-local models studied, these fields are “ghosts”, that is, fields whose kinetic energy is negative. However, the fact that their initial conditions are constrained has led some to deduce that their effect on classical stability is automatically null. We show that this argument is precisely the product of local field theory reasoning that does not apply to non-local theories. In conclusion, the auxiliary fields are just as capable of destabilizing a solution as any unconstrained dynamical field. However, unlike in the quantum case, the presence of “ghosts” does not necessarily invalidate the theory, since the divergences can be slow enough or even countered by non-linear effects. This is why a classical stability study is necessary in each case.

Once these points are clarified, we return to the non-local linear spin-2 theory that we had constructed and attempt to generalize it to a theory of non-local gravity, that is, we construct non-linear extensions. To do so, we follow two different procedures: one based on a non-local action and one operating directly at the level of the equations of motion with the help of transverse projectors. We thus obtain a class of non-linear models that we subject to some phenomenological constraints. These reduce the models to two one-parameter extensions of the recently proposed Maggiore (M) and Maggiore-Mancarella (MM) models, which connect them continuously to General Relativity with a cosmological constant.

These models contain an ultra-light “ghost”, but recent and complete numerical studies of cosmological perturbations show that the M and MM models are statistically equivalent to ΛCDM, within the error margins of current data. This suggests that the extensions are as well, since they only bring us closer to ΛCDM. This makes them interesting, despite the fact that an additional parameter decreases the predictive power of a theory. Finally, we study the cosmological background of these models numerically and analytically.
Chapter 1

Introduction 


During my PhD, the research that I have conducted within the group of my PhD advisor Prof. 
Maggiore has focused on several aspects of the problem of dark energy in late-time cosmology. 
Here are the resulting publications: 

• “Stability analysis and future singularity of the $m^2 R \Box^{-2} R$ model of non-local gravity”
with Yves Dirian
JCAP 10 (2014) 065

• “Cosmological dynamics and dark energy from non-local infrared modifications of gravity”
with Stefano Foffa and Michele Maggiore
Int. J. Mod. Phys. A 29 (2014) 1450116

• “Apparent ghosts and spurious degrees of freedom in non-local theories”
with Stefano Foffa and Michele Maggiore
Phys. Lett. B 733 (2014) 76-83

• “A non-local theory of massive gravity”
with Maud Jaccard and Michele Maggiore
Phys. Rev. D 88 (2013) 044033

• “Bardeen variables and hidden gauge symmetries in linearized massive gravity”
with Maud Jaccard and Michele Maggiore
Phys. Rev. D 87 (2013) 044017

• “Zero-point quantum fluctuations in cosmology”
with Lukas Hollenstein, Maud Jaccard and Michele Maggiore
Phys. Rev. D 85 (2012) 124031

• “Early dark energy from zero-point quantum fluctuations”
with Lukas Hollenstein, Maud Jaccard and Michele Maggiore
Phys. Lett. B 704 (2011) 102-107

An important part of this work consisted in the construction and study of a non-local theory of 
massive gravity and related non-local modifications of General Relativity that would produce 
a dark energy effect in accordance with observations. This is the subject on which I would like 
to focus my PhD thesis. 
1.1 Background


In recent decades the field of cosmology has witnessed an effervescence comparable to the one that permeated particle physics in the 1960s and 1970s and resulted in the birth of the Standard Model (SM). As is often the case in science, it is the development of the experimental/observational branch of the discipline that allows the theoretical research to blossom. Indeed, the intense activity in observational cosmology during the last two decades has turned the discipline into a precise quantitative science, with more and more satellite, balloon and ground-based missions coming to enrich and refine the data pool. This has allowed theorists to converge on a six-parameter concordance model, dubbed “ΛCDM”, whose statistical predictions fit the data within the current error bars. These two factors, the rich and accurate data and the theoretical concordance model, constitute a solid basis for modern cosmology. This is still a very active area of research, as many more missions will take place in the future, providing more accurate input that will allow discriminating between models.

An important aspect of the concordance model, on top of the fact that it matches observations in a satisfying way, is that it mostly relies on well-understood physics. Indeed, on the one hand there is General Relativity (GR), which determines the dynamics of space-time in the presence of matter, and on the other hand there is the SM, which determines the content and microscopic dynamics of that matter. It is remarkable that the combination of these two pillars of modern theoretical physics already suffices to describe many aspects of the observed cosmology.

Nevertheless, there are also important parts of the concordance model which still remain unaccounted for from the theoretical point of view. The two outstanding ones in late-time cosmology are referred to as the “dark matter” and “dark energy” problems. These are significant extra elements compared to what GR and the SM alone would predict. They have therefore greatly contributed to the enthusiasm for theoretical cosmology and to the setting up of further observational missions.

Before we discuss these two issues, let us also briefly mention the other important challenge in cosmology, namely the understanding of its very early stages. The currently dominant paradigm, by far, is the theory of inflation [1,2] (see [3] for a review), in which the universe undergoes a period of accelerated expansion. This is theoretically appealing because it naturally leads to an approximately homogeneous, isotropic and spatially flat universe, like the one we observe. Most importantly, however, it explains the large-scale structure by relating it to primordial quantum fluctuations generated during this inflationary phase.

Dark matter 

On Earth and solar-system scales the dynamics of GR and the matter content of the SM suffice to explain the observed phenomena, at least at the level of accuracy reached by experiment¹. Unfortunately, this success story does not extend to larger scales such as the galactic, extragalactic and cosmological ones.

On astrophysical scales, the rotation curves of galaxies and the motions of galaxies in galaxy clusters cannot be explained by the masses that we see through telescopes. Rather, the observed motions correspond to the gravitational forces one would have in the presence of a larger amount of non-relativistic matter. On cosmological scales, it seems that non-relativistic matter constitutes nearly 30% of the critical density today, while the observed baryonic matter, which matches the expected abundance from SM Big-Bang nucleosynthesis, can only account for about 5%.

¹ A possible exception to this statement would be the neutrino masses, which are taken to be zero in the SM, while it has been discovered that $m_\nu \neq 0$ from measurements of neutrino oscillations.

Therefore, the simplest modification one can think of that would correct this discrepancy is to include a speculative type of particle with the following properties. It should not interact (or only very weakly) with light, thus making it practically invisible, it should be rather massive so that it scales as non-relativistic matter, and it should be stable on a time-scale of the age of the universe. Cosmological structure formation also suggests that it is non-relativistic at the time at which it decouples from the original plasma, and that its interactions are dominated by gravity. This way, that matter can clump into halos, which then provide the necessary gravitational potential for ordinary matter to agglomerate into the galaxies, clusters and filaments we see today². Furthermore, the fact that no such new particle has been detected in accelerators yet, along with the fact that Big-Bang nucleosynthesis should not be disturbed too much, implies that it should interact very weakly with SM matter. This is what one refers to as “Cold Dark Matter”, which gives the last three letters of “ΛCDM”³.

Dark energy 

Another important effect which is theoretically puzzling lies in the trend of the late-time expansion of the universe. In the late 1990s, two independent groups [4,5] analyzed the light-curves of type Ia supernovae and reported that the data imply an accelerated expansion of the universe at late times. This behaviour has been confirmed by many satellite and ground-based observations and will be further studied by missions planned for the future. The main complementary evidence comes from the anisotropies of the Cosmic Microwave Background radiation (CMB) and the Baryon Acoustic Oscillations in the large-scale structure of matter (BAO)⁴.

This observation was surprising because ordinary fluids such as matter and radiation can only produce a decelerating expansion. Indeed, from the second Friedmann equation it follows that acceleration implies a negative pressure $p < -\rho/3$, since the energy density $\rho$ must be positive. In the case of dark matter, although its precise nature still eludes us, the most probable scenario is that it corresponds indeed to some massive particle(s) that could one day be detected in a collider. On the other hand, because of its negative pressure, dark energy seems to lie one step further on the scale of mysteriousness. Indeed, its properties are not those of a fluid made of standard particles and the speculations about its fundamental nature are much more varied. This discovery was rewarded with the Nobel Prize in Physics in 2011, given its astonishing implications for our understanding of the universe.
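
The step from the second Friedmann equation to the bound on the pressure can be spelled out as follows. This is a minimal sketch assuming a spatially homogeneous and isotropic universe with scale factor $a(t)$ and a single perfect fluid of energy density $\rho$ and pressure $p$, with no separate cosmological constant term; the symbol $a$ is not defined in the text above and is introduced here only for illustration.

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p) , \qquad \ddot a > 0 \;\Longleftrightarrow\; \rho + 3p < 0 \;\Longleftrightarrow\; p < -\frac{\rho}{3} .$$

For non-relativistic matter ($p = 0$) and radiation ($p = \rho/3$) the right-hand side is negative, so the expansion decelerates.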

² Indeed, in the absence of that effect, it would have taken longer for ordinary matter to form the large-scale structures, in contradiction with observations.

³ “Cold” because it is massive and weakly interacting, and “Dark” because it does not interact with light. Note that a more appropriate term would be “cold transparent matter”, because a dark object does interact with light since it absorbs it. For example, a black hole is “dark” while dark matter is not, although the name is certainly more catchy.

⁴ It should be noted, however, that what is actually being measured in all three of these independent observations is the distance-redshift relation $D(z)$ [6]. Thus, the possibility remains that the inferred acceleration is only an apparent effect of physics which influences $D(z)$ [6].

Clearly, there are two, not mutually exclusive, possibilities for explaining this effect: either one must postulate the existence of a new source on the right-hand side of the Einstein equation that would support this expansion, or one must modify GR in the infra-red so that acceleration is obtained by altering the behavior of gravity itself⁵. Since the degrees of freedom or mechanisms responsible for this late-time acceleration are as yet unknown, the community refers to them generically as “dark energy”. This energy would then account for nearly 70% of today’s total energy of the universe.

From the theoretical point of view, quite remarkably, the best dark energy candidate for fitting the data [8] is also the simplest term one could think of in the Einstein equation, namely, a positive cosmological constant

$$G_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G\, T_{\mu\nu} . \tag{1.1.1}$$

This “Λ” is the one which is found in “ΛCDM”, so that the name of the model reflects how it describes the “dark” sector. A very revealing plot is the one which combines constraints from type Ia supernovae, CMB and BAO observations on the $(w, \Omega_m)$ plane, where $\Omega_m$ is the energy fraction corresponding to non-relativistic matter (dark and ordinary) today and $w$ is the equation of state parameter of the dark energy component. Assuming a spatially flat universe, the fraction corresponding to dark energy today is $1 - \Omega_m$, and also assuming that $w$ is constant in time one gets figure 1.1 [8,9]. Indeed, one directly sees that dark energy makes up approximately 70% of today’s energy budget and is consistent with the time-evolution of a cosmological constant since $w \simeq -1$.

Now if we rather put this $\sim \Lambda$ term on the right-hand side and interpret it as a constant source, we have that

$$T^{\Lambda}_{\mu\nu} = -\frac{\Lambda}{8\pi G}\, g_{\mu\nu} . \tag{1.1.2}$$

Thus, this energy-momentum tensor has a non-diluting (constant) energy density and negative pressure. These are both counter-intuitive properties for fluids made of particles, but might be accounted for if we resort to a more “microscopic” interpretation. Indeed, a constant source could typically correspond to the contribution of a potential term $\Lambda \sim V(\langle\phi\rangle)$ in the quantum effective action of some Higgs-like field in a broken symmetry phase. This kind of dark energy is known as “quintessence” and, along with its generalizations (“k-essence”, etc.), represents one of the most studied alternatives to the cosmological constant. An important difference with the latter is that $\langle\phi\rangle$ is not necessarily constant in time and that the new field brings in additional degrees of freedom in cosmological perturbation theory.
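
The two properties of the constant source mentioned above (non-diluting energy density and negative pressure) can be read off by comparing it with a perfect fluid. Moving the $\Lambda$ term of (1.1.1) to the right-hand side gives a source $-\frac{\Lambda}{8\pi G}\, g_{\mu\nu}$. The following is a sketch under stated assumptions: the metric signature is taken to be $(-,+,+,+)$, $u^\mu$ is the fluid four-velocity with $u^\mu u_\mu = -1$, and the labels $\rho_\Lambda$, $p_\Lambda$ are introduced here for illustration only.

$$(\rho + p)\, u_\mu u_\nu + p\, g_{\mu\nu} = -\frac{\Lambda}{8\pi G}\, g_{\mu\nu} \quad \Longrightarrow \quad \rho_\Lambda = \frac{\Lambda}{8\pi G} , \qquad p_\Lambda = -\rho_\Lambda , \qquad w = \frac{p_\Lambda}{\rho_\Lambda} = -1 .$$

With $\rho_\Lambda + p_\Lambda = 0$ the continuity equation $\dot\rho = -3H(\rho + p)$ gives $\dot\rho_\Lambda = 0$, which is the statement that this energy density does not dilute with the expansion.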

On the other hand, if we interpret (1.1.1) as a modification of gravity, i.e. keep the term on the left-hand side of (1.1.1), involving just another constant of nature $\Lambda$, then this seems to be the most economical, conservative and also natural solution. Unfortunately, it is the quantum side of physics which disagrees with this interpretation. In the following section we succinctly review the main arguments of the so-called “cosmological constant problem”.

⁵ It is interesting to note, however, that in most cases this distinction may not be clear, as it is often possible to reproduce the phenomenology of modified gravity models with appropriate dark energy sources [7].

Figure 1.1: Left panel: the 1σ, 2σ and 3σ confidence regions of the combined constraints of type Ia supernovae (blue), CMB (orange) and BAO (green), without systematic errors. Plot by Amanullah et al. [9] using the Union 2 compilation of supernovae, the WMAP7 data for the CMB and the SDSS DR7 and 2dF Galaxy Survey data for the BAO (2010). Right panel: the confidence region for $w$ from the Planck collaboration [8] (2013). The combined CMB constraints of Planck and WMAP7 alone (green line), in combination with supernovae data (SNLS in blue and Union 2.1 in red) or BAO data (black). The latter are a combination of SDSS DR7, WiggleZ, BOSS DR9 and 6dF Galaxy Survey data.

1.2 The quantum vacuum problem

We may start by noting that the cosmological constant term plays exactly the role of the vacuum energy of field theory on flat space-time. Indeed, the $\Lambda$ term in the Einstein equation corresponds to a constant term in the Einstein-Hilbert action

$$S = \frac{1}{16\pi G} \int d^4x\, \sqrt{-g}\, (R - 2\Lambda) . \tag{1.2.1}$$

In the case $g_{\mu\nu} = \eta_{\mu\nu}$ this is just a constant that produces an overall energy shift. This does not mean that a vacuum energy has no observable effects, as is clearly demonstrated by the Casimir effect in QFT for instance, but that only energy differences are relevant, not absolute values⁶. In GR however, every kind of energy gravitates, since this is what we find by definition on the right-hand side of the Einstein equation, and the physics therefore depends on the absolute value of $\Lambda$. For energies well below the Planck scale, since the interactions with gravitons are heavily suppressed, the gravitational dynamics can be treated semi-classically to a very good approximation. This means that gravity can be described classically, but sourced by the vacuum expectation value of quantum matter fields. Formally, we have

$$G_{\mu\nu} = 8\pi G\, \langle 0 | T_{\mu\nu}[\phi] | 0 \rangle , \tag{1.2.2}$$

although the vacuum state $|0\rangle$ may not be unique or easy to define. In any case, the quantum vacuum energy of matter is expected to appear as a cosmological constant on the right-hand side. In QFT on flat space-time, each bosonic field mode brings in a vacuum energy contribution which is formally divergent

$$E_0(p) = \frac{1}{2} \sqrt{p^2 + m^2}\, (2\pi)^3 \delta^{(3)}(0) \to \frac{1}{2} \sqrt{p^2 + m^2}\, L^3 \tag{1.2.3}$$

and must thus be regularized by inserting some infra-red cut-off length $L$. The total vacuum energy is then the integral over all the modes, which must also be regularized but with an ultra-violet cut-off $\Lambda_c \gg L^{-1}$

$$E_{\rm vac} = \int^{\Lambda_c} \frac{d^3p}{(2\pi)^3}\, E_0(p) = \frac{L^3 \Lambda_c^4}{16\pi^2} + \dots , \tag{1.2.4}$$

where the dots are lower-order terms in $\Lambda_c$. Finally, the vacuum energy density is simply

$$\rho_{\rm vac} \equiv \frac{E_{\rm vac}}{L^3} = \frac{\Lambda_c^4}{16\pi^2} + \dots , \tag{1.2.5}$$

so the infra-red regulator is irrelevant for this “local” quantity. This computation can also be performed for the rest of the $T^{\rm vac}_{\mu\nu}$ components and on less trivial backgrounds, if the latter have enough isometries, such as in cosmology for instance. Barring some subtleties, one gets that $T^{\rm vac}_{\mu\nu} = {\rm const.} \times g_{\mu\nu}$ for the leading-order term⁷, so this takes indeed the form of a cosmological constant.

⁶ See [10] for a review of the Casimir effect.
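
The leading term of the vacuum energy density, $\Lambda_c^4/(16\pi^2)$, can be checked by a short explicit computation. This is a sketch assuming a sharp spherical cut-off $|\vec p| < \Lambda_c$ on the three-momentum and $\Lambda_c \gg m$, so that $\sqrt{p^2 + m^2}$ can be expanded for large $p$; both assumptions are made here for illustration.

$$\int^{\Lambda_c} \frac{d^3p}{(2\pi)^3}\, \frac{1}{2}\sqrt{p^2 + m^2} = \frac{1}{4\pi^2} \int_0^{\Lambda_c} dp\, p^2 \sqrt{p^2 + m^2} = \frac{1}{4\pi^2} \int_0^{\Lambda_c} dp \left( p^3 + \frac{m^2 p}{2} + \dots \right) = \frac{\Lambda_c^4}{16\pi^2} + \frac{m^2 \Lambda_c^2}{16\pi^2} + \dots$$

The first term is the quartic contribution to the energy density, and the mass-dependent term is of lower order in $\Lambda_c$.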

For fermionic fields, we have the same result but with the opposite sign. Thus, as soon as the number of bosons and fermions is not equal, the “natural” value of $\rho_{\rm vac}$ is as high as the cut-off of the theory, from the effective field theory point of view. For the SM, where we know the effective theory to hold at least up to the scale where it has been tested ($\Lambda_c \sim$ TeV), we have at least $\rho_\Lambda \sim {\rm TeV}^4 = 10^{12}\, {\rm GeV}^4$. As a matter of fact, since the SM has more fermionic degrees of freedom than bosonic ones, we should even expect a negative result. What is known as the “cosmological constant problem” [14,15] is that what we observe in cosmology is rather a tiny positive value $\rho_\Lambda \sim 10^{-47}\, {\rm GeV}^4$, that is, a difference of some sixty orders of magnitude at least!
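
The size of the mismatch follows from simple arithmetic on the figures quoted above. The snippet below is a minimal sketch, not part of the original analysis: it assumes the naive estimate $\rho_{\rm vac} \sim \Lambda_c^4$ (dropping the $1/16\pi^2$ factor), and the function name `orders_of_magnitude` and the constant `RHO_OBSERVED_GEV4` are hypothetical names chosen for illustration.

```python
import math

# Observed dark energy density quoted in the text, in GeV^4.
RHO_OBSERVED_GEV4 = 1e-47


def orders_of_magnitude(cutoff_gev: float, rho_observed: float = RHO_OBSERVED_GEV4) -> float:
    """Return log10(cutoff^4 / rho_observed) for the naive estimate rho_vac ~ cutoff^4.

    The 1/(16 pi^2) prefactor of the leading term is deliberately ignored,
    since only the order of magnitude is of interest here.
    """
    if cutoff_gev <= 0 or rho_observed <= 0:
        raise ValueError("cut-off and observed density must be positive")
    return 4 * math.log10(cutoff_gev) - math.log10(rho_observed)


if __name__ == "__main__":
    # Cut-off at the TeV scale (1 TeV = 1e3 GeV), as in the SM estimate above.
    print(f"TeV cut-off: {orders_of_magnitude(1e3):.0f} orders of magnitude")
```

For a 1 TeV cut-off this gives 59, i.e. the roughly sixty orders of magnitude quoted above. The function takes the cut-off as a parameter, so the same arithmetic can be repeated for any other assumed scale.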


⁷ If one uses a cut-off on momentum space then the result for the leading term $\sim \Lambda_c^4$ is actually $p_{\rm vac} = \rho_{\rm vac}/3$, whereas if $T^{\rm vac}_{\mu\nu} \sim g_{\mu\nu}$ then one should rather find $p_{\rm vac} = -\rho_{\rm vac}$. The 1/3 ratio is the one obeyed by radiation and is inconsistent with a constant $\rho$ because then the continuity equation $\dot\rho = -3H(\rho + p)$ is not satisfied, so this result is in contradiction with general covariance. This apparent problem arises because these cut-off-dependent (“bare”) quantities are not the physical (“renormalized”) quantities. Since the cut-off is imposed on the 3-momenta, it breaks covariance and thus so does the resulting energy-momentum tensor. The freedom in choosing the counter-terms then allows one to impose the correct relation for the renormalized quantities, $p_{\rm vac} = -\rho_{\rm vac}$. As a matter of fact, had we started with a regularization that preserves covariance, such as dimensional regularization, this is the result we would have obtained. Thus, the apparent 1/3 ratio is an artefact of our regularization scheme, and the physics cannot depend on it [11-13].

Renormalization group viewpoint

Although the above description of the “quantum vacuum catastrophe” is probably the standard point of view on the dark energy problem in the community, it must be stressed that it relies more on theoretical hand-waving arguments than on experimentally tested physics. Indeed, the vacuum energy is a feature of perturbative QFT whose absolute value is not observable in that theory, i.e. it is not an aspect of the theory which can be checked. Therefore, we do not know if it has any physical validity for us to take into account as such when generalizing to generally-covariant physics. Moreover, even in QFT the absolute value of the vacuum energy is an ill-defined notion since one can get rid of it by choosing the so-called “normal ordering” for the Hamiltonian operator⁸, i.e. this issue is related to the ordering ambiguity of quantum mechanics. And this is not the only argument which casts doubt on the effect of vacuum energy within the QFT framework.

Indeed, an important remark is that this is merely a “naturalness” argument, not a prediction [11-13,16]. In QFT the parameters of the Lagrangian cannot be predicted, only their dependence on the probing scale can, i.e. their running under the renormalization group. Thus, one can a priori fix them at any value suggested by experiment at some scale, and only then will their values at other scales be predicted. In the case of the leading part of the cosmological constant, there is no dependence on the probing scale, since it is a constant, and it can thus be chosen arbitrarily small at all scales. The apparent unnaturalness of this choice is then due to the fact that the observed tiny magnitude corresponds to a huge precision compared to the expected value. If what we expect is of order one, then the value we wish to give is of order $10^{-60}$, i.e. 60 digits of precision with respect to the natural scale. The unnaturalness argument thus corresponds to this incredibly fine tuning that must be performed. However, from the renormalization group point of view, only the running is physical, not the absolute values of the cut-off-dependent quantities, so the above-mentioned fine-tuning is not between physical quantities.

Effective field theory viewpoint 

So why should one continue taking the cosmological constant problem so seriously? The point is that in the effective field theory viewpoint of QFT [17-22], which is its modern interpretation, the cut-off-dependent quantities do acquire some physical substance. Indeed, the cut-off scale is usually related to the strong-coupling scale for perturbatively non-renormalizable theories, i.e. the energy at which the perturbative expansion breaks down. For instance, in the case of GR this scale would be the Planck mass. In practical examples of effective theories with known ultra-violet completions, the cut-off is related to the mass of some new particle, which is thus not seen in the effective theory, and which softens the interaction by being produced precisely near the cut-off. This allows us to access higher energy scales perturbatively, but with a larger theory encompassing the heavy particles. This is for example the case of the Higgs field when the effective theory is a massive Yang-Mills theory with fixed mass, or of the $W^\pm$, $Z$ bosons when the effective theory is Fermi’s theory of four-fermion weak interactions, or of the radial mode in the effective theory of the Goldstone modes of a sigma model. In all these cases, the cut-off of the effective theory is related to the activation of some new degrees of freedom.

The question that now arises is whether this effective field theory logic applies to vacuum energy. Indeed, by definition, the vacuum has nothing to do with particles or interaction scales. Thus, as long as we are within the QFT framework, it appears that we should keep adding up the vacuum energies of higher and higher momenta. This would then end only at a scale where the mathematical description is not QFT anymore⁹. We are aware of such a scale, the Planck scale. Indeed, there the graviton interactions are strong and thus the structure of space-time becomes ambiguous, so that the local Minkowski approximation of QFT stops making sense. Thus, from the effective field theory point of view, we get an even larger estimate of the quantum vacuum energy, that is $\rho_\Lambda \sim M_{\rm Pl}^4 \sim 10^{76}\, {\rm GeV}^4$, giving a difference of 123 orders of magnitude with the observed value!

⁸ This is usually expressed in terms of creation and annihilation operators, but in terms of $\phi$ and its conjugate momentum $\pi$ it amounts to adding a singular term $\sim [\phi(x), \pi(x)]$ in $H$ which of course vanishes classically.

⁹ An analogous case is the theory of fluids, which is an effective theory of space-time fields whose underlying ultra-violet completion is not a field theory but the dynamics of a large number of constituent elements. In that case, one also finds that the orders of magnitude of the parameters of the fluid are related to the fundamental scales arising in the microscopic element interactions.

From the above paragraphs we understand that the issue of the quantum vacuum in GR is not so well defined and is rather complicated, to say the least. Nevertheless, it is always a good theoretical exercise to look for alternative ways to describe a given phenomenon, even when what keeps us from choosing the simplest solution could be a matter of “semantics”. Moreover, with increasing observational data, these alternatives can be tested. Thus, even if ΛCDM turns out to still be a good fit in ten or twenty years, this statement would carry much more weight if several alternatives had also been considered.

To summarize, the problem of dark energy is two-fold. First, one has to come up with a mechanism/argument for taming the quantum vacuum. In most cases, this is achieved only at the cost of making $\rho_{\rm vac}$ vanish exactly (e.g. supersymmetry), unless there is some fine-tuning. If $\rho_{\rm vac} = 0$, then one must also come up with a mechanism for producing some form of dark energy.


1.3 Massive gravity

As already mentioned, in this thesis we are going to explore the possibility of modifying gravity in the infrared in order to account for the dark energy effect, instead of considering some extra source on the right-hand side of the Einstein equation. One of the most studied modifications of the gravitational Lagrangian, motivated by both ultra-violet and infra-red physics, is the one where the Ricci scalar is replaced by an arbitrary function $f(R)$. Among other modifications involving also tensor curvature invariants, this class is distinguished by the fact that it has no ghosts (see [23] for a review). Another much studied model of infrared modified gravity is the Dvali-Gabadadze-Porrati (DGP) brane-world model [24]. Although it has been shown to be non-viable, its theoretical by-products, such as the Galileon theory [25], have been instrumental in the development of massive gravity.

Since GR describes a massless particle, when interpreted as a QFT on flat space-time, the simplest modification one can think of that hopefully alters only the infrared physics is giving a mass to that particle. The resulting theory of “massive gravity” has been both an inspiration and a (chronological) starting point for our work on non-local modifications of gravity, so we find it appropriate to summarize some of its important features.

Expected advantages

By (Lorentz-invariant) “massive gravity” one commonly means a deformation of GR having the following properties:

• In the absence of matter fields, Minkowski space-time is a linearly stable solution.

• The theory is Lorentz-invariant over that background.

• The spectrum of its linearized QFT over that background is a massive spin-2 particle.

It is not surprising that Minkowski space-time plays a privileged role in defining massive gravity, since the notions of particle, and thus mass, are well-defined only through the isometries of that background, i.e. the Poincaré group. A formulation of “massiveness” which would be applicable to more general backgrounds would involve the notion of a gap, that is, that the field quanta have a minimal amount of energy $m > 0$. Classically, whenever the background is symmetric enough so that a dispersion relation $\omega(k)$ of the perturbations can be defined, we would have that $\omega(0) = m > 0$.

Following the general wisdom of weakly interacting theories on Minkowski space-time, a 
mass usually makes the field both insensitive to, and of little influence on, energy-momentum 
scales obeying $p, E \ll m$. Indeed, this is merely the fact on which effective field theory is based. 
Extrapolating these assumptions, as such, to the case of a fully non-linear theory of massive 
gravity would have the following consequences. 

First, massive gravity would be insensitive to a cosmological constant, since the latter is 
the most extreme example of infrared source. Second, the deceleration of the expansion of 
the universe should decrease as the background curvature approaches the scale $m$, since the 
gravitational interaction would be cut off at energies lower than $m$. This would suggest that 
the mass $m$ should be of the order of the Hubble parameter today $H_0$. 

Any mechanism that would screen the cosmological constant, or more generally infrared 
sources, from gravity goes by the name “degravitation”, an idea that was first considered 
independently of any massive theory of gravity [26-29]. This provides a very elegant resolution 
of the cosmological constant problem, by revealing that the true question is not why $\rho_{\rm vac}$ is so 
small, but rather why it affects gravity so little. 

Finally, another expected advantage of massive gravity is that, unlike the cosmological 
constant, a small value of the graviton mass would be “technically natural”, in the following 
sense. A naive dimensional analysis would first suggest that, under radiative corrections, 
$\delta m^2 \sim \Lambda_c^2$, which is not that much of an improvement compared to $\rho_{\rm vac} \sim \Lambda_c^4$. However, 
as in any gauge theory, adding a fixed mass necessarily breaks the gauge symmetry, here 
diffeomorphisms. In the massless case that symmetry protects the mass from being generated 
by loop corrections, so as $m^2 \to 0$, the corrections should tend to zero as well. This is the 
naturalness argument of 't Hooft [30], which implies that $\delta m^2 \sim m^2$ and thus, by dimensional 
analysis, $\delta m^2 \sim m^2 \log \Lambda_c$. In conclusion, the renormalized mass would be close to the bare 
one even for huge values of the cut-off $\Lambda_c$. 

Thus, following these naive expectations for a massive theory, one could obtain both a 
solution to the cosmological constant problem and possibly a naturally small dark energy. Of 
course, as stressed, these are hand-waving arguments that have no reason to apply in the case 
of non-linear theories over non-trivial backgrounds such as GR in cosmology. Nevertheless, 
they are certainly enough to tickle one’s curiosity about what kind of phenomenology a theory 
of massive gravity would imply. This has indeed been the case recently, as the past few years 
have witnessed considerable excitement in this area. However, massive gravity has a much 
longer history that dates back to the late 1930s. 
Brief history 


Since Minkowski space-time plays a privileged role in defining massive gravity, in order to 
conceptually appreciate the theory it is convenient to adopt the particle physics interpretation 
of GR: the latter is the unique theory, under some reasonable assumptions, of a massless 
spin-2 particle with consistent interactions [31,32]. Indeed, GR can be expressed as a special 
relativistic gauge theory in terms of the perturbation around Minkowski space-time $h_{\mu\nu} = 
\frac{M}{2}(g_{\mu\nu} - \eta_{\mu\nu})$ 

$$ S_{\rm EH} = \int d^4x \left[ -\frac12\,\partial_\mu h_{\nu\rho}\,\partial^\mu h^{\nu\rho} + \partial_\mu h^{\mu\nu}\,\partial^\rho h_{\rho\nu} - \partial_\mu h^{\mu\nu}\,\partial_\nu h + \frac12\,\partial_\mu h\,\partial^\mu h + O(\lambda h, \lambda^2 h^2, \dots)\,\partial h\,\partial h \right] , \tag{1.3.1} $$


where the indices are displaced using $\eta_{\mu\nu}$, i.e. the special relativistic convention. Here $M = 
1/\sqrt{8\pi G}$ (in natural units $\hbar = c = 1$) is the reduced Planck mass and $\lambda = M^{-1}$ is the reduced 
Planck length playing the role of the small coupling constant. The diffeomorphisms now act as 
a non-abelian gauge symmetry on $h_{\mu\nu}$ 

$$ \delta h_{\mu\nu} = -\partial_\mu\xi_\nu - \partial_\nu\xi_\mu - \xi^\rho\partial_\rho h_{\mu\nu} - h_{\rho\nu}\,\partial_\mu\xi^\rho - h_{\mu\rho}\,\partial_\nu\xi^\rho , \tag{1.3.2} $$

whose “global” subgroup 10 consists of the isometries of Minkowski ($\partial_{(\mu}\xi_{\nu)} = 0$), i.e. the Poincaré group. 
This is a derivatively coupled effective field theory whose cut-off, or strong-coupling scale, is 
given by the Planck scale. 

Since $h_{\mu\nu}$ is a two-tensor one can form two Lorentz-invariant quadratic combinations to 
form a mass term, these being $h_{\mu\nu}h^{\mu\nu}$ and $h^2$. At the linearized level, the only combination 
which yields a linearly stable theory was found by Fierz and Pauli (FP), in 1939, to be [33] 

$$ S_{\rm FP} = -\frac{m^2}{2}\int d^4x \left( h_{\mu\nu}h^{\mu\nu} - h^2 \right) . \tag{1.3.3} $$

The linear theory describes a massive spin-2 excitation, so by one of Wigner’s theorems, there 
are five degrees of freedom. Any other mass term will necessarily introduce a sixth degree of 
freedom which is a Lorentz scalar but is also a ghost, i.e. it has a negative kinetic energy and 
thus makes the total energy unbounded from below. 

Much later, in 1970, it was independently realized by van Dam and Veltman [34], and 
Zakharov [35], that unlike spin-1 massive gauge theories, the spectrum of the spin-2 one is 
discontinuous in the massless limit, a feature that is known as the “vDVZ” discontinuity. 
Indeed, inverting the quadratic form of the graviton Lagrangian to obtain the propagator and 
saturating it with conserved sources one gets 


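$$ T^{\mu\nu}(-k)\,D_{\mu\nu\rho\sigma}(k)\,T^{\rho\sigma}(k) = T^{\mu\nu}(-k)\,\frac{1}{k^2 + m^2}\left[ \frac12\,\eta_{\mu\rho}\eta_{\nu\sigma} + \frac12\,\eta_{\mu\sigma}\eta_{\nu\rho} - \frac13\,\eta_{\mu\nu}\eta_{\rho\sigma} \right] T^{\rho\sigma}(k) , \tag{1.3.4} $$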

whereas in the massless case the last factor is 1/2 instead of 1/3. This implies that in the 
massless limit one obtains the GR result plus an extra scalar pole, i.e. a “fifth force” between 
the sources 


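$$ \lim_{m\to 0} T^{\mu\nu}(-k)\,D_{\mu\nu\rho\sigma}(k)\,T^{\rho\sigma}(k) = {\rm GR} + \frac16\,T^{\mu\nu}(-k)\left[ \frac{\eta_{\mu\nu}\eta_{\rho\sigma}}{k^2} \right] T^{\rho\sigma}(k) . \tag{1.3.5} $$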


10 That is, the subgroup inducing a homogeneous transformation for $h_{\mu\nu}$. 

This means that however small the mass may be, there will be $O(1)$ differences with GR. For 
instance, if one fixes the normalization of M by requiring the correct Newtonian limit, then 
the bending of light by a massive object deviates by 25% from the GR prediction [31,32]. 
Moreover, if the limit $m \to 0$ is not continuous, the argument that makes the mass “natural” 
under radiative corrections does not necessarily hold anymore. Most importantly however, 
this discontinuity suggests that giving a mass to gravity does not only modify its infra-red 
behaviour! 
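
The 25% figure can be checked by hand from the tensor structures alone. The following is a minimal sketch, not part of the original computation: it assumes the exchange amplitude between two conserved symmetric sources is proportional to $T_1^{\mu\nu}T_{2\,\mu\nu} - c\,T_1 T_2$, with $c = 1/3$ in the massive case and $c = 1/2$ in the massless one, and it uses two illustrative sources (a static point mass and a null ray). The helper names `trace`, `contract`, `exchange` and `outer` are hypothetical.

```python
from fractions import Fraction

ETA = [-1, 1, 1, 1]  # diagonal of the Minkowski metric, signature (-,+,+,+)

def trace(T):
    """eta_{mu nu} T^{mu nu} for a 4x4 matrix of upper-index components."""
    return sum(ETA[a] * T[a][a] for a in range(4))

def contract(T1, T2):
    """T1^{mu nu} T2_{mu nu}, lowering indices with the diagonal metric."""
    return sum(T1[a][b] * ETA[a] * ETA[b] * T2[a][b]
               for a in range(4) for b in range(4))

def exchange(T1, T2, c):
    """Tensor structure T1 [ (eta eta + eta eta)/2 - c eta eta ] T2
    for symmetric sources; c = 1/3 (massive) or c = 1/2 (massless)."""
    return contract(T1, T2) - c * trace(T1) * trace(T2)

def outer(k):
    """Symmetric tensor k^mu k^nu built from a 4-vector k."""
    return [[k[a] * k[b] for b in range(4)] for a in range(4)]

static = outer([1, 0, 0, 0])  # static point mass: only T^{00} is non-zero
light = outer([1, 1, 0, 0])   # null ray along x: traceless T^{mu nu}

C_MASSIVE, C_MASSLESS = Fraction(1, 3), Fraction(1, 2)

# Static-static exchange fixes the Newtonian normalization in each theory.
newton_ratio = exchange(static, static, C_MASSIVE) / exchange(static, static, C_MASSLESS)
# Static-light exchange controls the bending of light.
light_ratio = exchange(static, light, C_MASSIVE) / exchange(static, light, C_MASSLESS)
# Light bending of the massive theory relative to GR, at fixed Newtonian limit.
bending_vs_gr = light_ratio / newton_ratio

print(newton_ratio, light_ratio, bending_vs_gr)  # prints: 4/3 1 3/4
```

Since the light source is traceless, its coupling is the same in both theories, while the static-static coupling is larger by 4/3 in the massive case. Rescaling the coupling to recover the Newtonian limit therefore leaves a light bending that is 3/4 of the GR value.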

Nevertheless, this is an artefact of the linearized theory and no discontinuity appears if one 
considers the fully non-linear kinetic term. In 1972 Vainshtein [36] computed the spherically 
symmetric stationary solution perturbatively, both close to and far away from the source. In the 
latter case, he found that the zero-order part was not the Schwarzschild solution, a mark of the 
vDVZ discontinuity, and that the expansion parameter was $r_V/r$, with $r_V$ a length scale 
now known as the “Vainshtein radius” that depends on $m$ and on the mass $M_s$ of the source. This implies that the 
region of validity of this solution, $r > r_V$, is pushed to infinity in the massless limit since then 
$r_V \to \infty$. Moreover, as one approaches from infinity, the non-linearities become important at 
$r_V$. On the other hand, close to the source the expansion parameter is $r/r_V$ and the zero-order 
part is the Schwarzschild solution 11 . Thus, GR is recovered close to the source and in the 
massless limit, but this cannot be seen in a perturbative expansion from the linear regime (far 
away from the source). This is now known as the “Vainshtein mechanism” and consists in 
the discontinuity of the linearized theory being “cured” by strong non-linear effects. The fifth 
force that appears in the propagator (1.3.5) is indeed present in the linear regime, but is then 
screened by non-linear effects at small scales. 

Soon after Vainshtein’s work, still in 1972, Boulware and Deser showed [39] that, unlike 
non-linear spin-1 gauge theories, considering the fully non-linear kinetic term of GR while 
keeping only the FP quadratic potential reactivated the sixth ghost mode which was precisely 
avoided with the FP tuning (1.3.3) in the linearized theory. Three decades later, in 2002, it 
was shown that this could still make sense as an effective field theory of an interacting massive 
graviton [40]. Indeed, the ghost’s mass lies above the cut-off $\Lambda_5 = (m^4 M)^{1/5}$ and the latter 
is parametrically larger than $m$. However, for a mass of the order of the Hubble scale today 
$m \sim H_0$ one gets the very large scale $\Lambda_5^{-1} \sim 10^{11}$ km, i.e. way larger than the millimeter scale 
down to which gravity has been tested. By adding higher powers to the Fierz-Pauli potential 
one can raise the cut-off to $\Lambda_3 = (m^2 M)^{1/3}$, giving $\Lambda_3^{-1} \sim 10^3$ km, which is however 
still quite large [40]. Moreover, around a heavy source the effective theory breaks down at a 
distance that is parametrically larger than $\Lambda^{-1}$ and also $r_V$, so that one has no access to the 
region where GR is recovered [40]. 
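
The two length scales quoted above follow from simple arithmetic in natural units. The sketch below is only an order-of-magnitude check under stated assumptions: the numerical inputs (a Hubble rate today of about $1.5 \times 10^{-33}$ eV, a reduced Planck mass of about $2.435 \times 10^{27}$ eV, and $\hbar c \approx 1.973 \times 10^{-7}$ eV m) are standard values supplied here for illustration, and the function name `inverse_scale_km` is hypothetical.

```python
# Order-of-magnitude check of the strong-coupling length scales for m ~ H_0.
HBAR_C_EV_M = 1.973269804e-7  # hbar * c in eV * m (conversion eV^-1 -> m)
H0_EV = 1.5e-33               # assumed Hubble rate today in eV (h ~ 0.7)
M_EV = 2.435e27               # reduced Planck mass (8 pi G)^(-1/2) in eV

def inverse_scale_km(scale_ev):
    """Convert an energy scale in eV to the corresponding length in km."""
    return HBAR_C_EV_M / scale_ev / 1.0e3

m = H0_EV
lambda5_ev = (m**4 * M_EV) ** (1.0 / 5.0)  # cut-off (m^4 M)^(1/5)
lambda3_ev = (m**2 * M_EV) ** (1.0 / 3.0)  # cut-off (m^2 M)^(1/3)

l5_km = inverse_scale_km(lambda5_ev)
l3_km = inverse_scale_km(lambda3_ev)

print(f"1/Lambda_5 ~ {l5_km:.1e} km")
print(f"1/Lambda_3 ~ {l3_km:.1e} km")
```

With these inputs the two lengths come out at the orders of magnitude quoted in the text, roughly $10^{11}$ km and $10^{3}$ km respectively, both far above the millimeter scale.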

The resolution of the ghost problem came only in 2010 in the works of de Rham, Gabadadze 
and Tolley (dRGT) [41, 42] which showed, in some special limit, that adding appropriately 
tuned higher-order terms in the potential removes the ghost at all orders in perturbation 
theory 12 . Shortly after that, it was shown that the degree of freedom count is indeed five 

11 A solution extending to all of space-time and matching the two asymptotic behaviours has been very difficult 
to find and its existence was first established numerically only in 2009 [37]. See [38] for an introduction to the 
Vainshtein mechanism and the modern approach to the subject. 

12 Moreover, this special structure of the potential has been shown to be stable enough under quantum corrections, 
in the sense that it does deviate from its ghost-free form, but that the resulting ghost has a mass lying 
above the cut-off [43]. 
without having considered any limit and non-perturbatively [44,45] 13 . Another advantage 
is that in the presence of a heavy source with mass $M_s$, the corresponding Vainshtein radius 
$r_V \sim (m^{-2} M^{-2} M_s)^{1/3}$ is now larger than the distance at which the effective theory breaks 
down, so that there exists a region where GR is recovered [31,32,47]. 

Unfortunately however, the cut-off is still $\Lambda_3$, although it has been argued that the actual 
region of validity of the theory could extend to higher energies [47]. Most importantly, it 
turns out that the theory admits only approximately (spatially flat) homogeneous and isotropic 
solutions [48] (for non-trivial $a(t)$), an important drawback for cosmology. One can have 
spatially open, or Bianchi type anisotropic solutions, but these are plagued by ghost instabilities 
[49,50]. Even so, the successful construction of an effective theory of a massive graviton with 
the above properties is a remarkable theoretical achievement. 

A discussion of the theoretical and phenomenological properties of the dRGT 
theories can be found in the reviews [31,32,47]. In the present thesis, the aspect of massive 
gravity which interests us is of a more conceptual nature. Indeed, when trying to express this 
theory in terms of the full metric $g_{\mu\nu}$ one inevitably ends up with $\eta_{\mu\nu}$ in the mass term as well, 
since the latter is not generally covariant. This leads to the following conceptual issues. 

Conceptual shortcomings of the dRGT approach 

The first source of discomfort is of course the lack of invariance. To deal with it one can 
still reinterpret the theory as a generally covariant one where there exists a privileged set of 
coordinates in which the tensor $\eta$ takes the form $\eta = {\rm diag}(-1,1,\dots,1)$. A related alternative, 
which practically amounts to the same situation, would be to consider this trivial metric $\eta$ as a 
dynamical field as well through a version of the so-called “Stückelberg trick”. One introduces 
four auxiliary scalars $\phi^a$ through the replacement 

$$ \eta_{\mu\nu} \to \eta_{ab}\,\partial_\mu\phi^a\,\partial_\nu\phi^b , \tag{1.3.6} $$

so that now $\eta_{\mu\nu}$ does transform like a tensor (while $\eta_{ab}$ is an “internal” metric) and takes its 
trivial form in the $x^\mu = \delta^\mu_a\phi^a$ coordinates. 

The Stückelberg trick is often cited as the prime example that any theory can be made 
gauge-invariant by simply introducing auxiliary fields patterned on the gauge transformation, 
a fact which is obviously true. However what cannot be retrieved after breaking diffeomorphism 
invariance with a mass is one of the founding principles of the theory: relativity. Indeed, the 
theory may be generally covariant but there exists a privileged set of coordinates, a preferred 
frame of reference, the one in which $\eta_{\mu\nu}$ becomes trivial. It must be emphasized that this 
preferred frame is determined at the theory level, i.e. it is independent of the specific solution 
we are interested in. This should be contrasted with the dynamically privileged frames that 
arise in many situations, such as the rest frame of the CMB in cosmology, or the rest frame 
of the Sun in solar-system physics. 

Another source of conceptual discomfort is the problem of choice: why $\eta$? Indeed, in 
principle one could, and actually one does [51,52], consider other choices for this “reference 
metric”, which is usually denoted by $f_{\mu\nu}$ 14 . But even if the phenomenology privileges one 
of these metrics, we would still be left with a “God-given” non-dynamical field. One way to 

13 Note however that this does not necessarily imply that the Minkowski solution is stable in the fully non-linear 
theory. Indeed, it is already a remarkably difficult task to demonstrate this in the case of GR [46]. 

14 In this case Minkowski space-time is not guaranteed to exist as a stable solution. If the background is 
solve this issue is bimetric gravity, first proposed in [53] and recently extended to a ghost-free 
theory of massive bigravity [54,55], in which case one considers an Einstein-Hilbert kinetic term 
for the reference metric as well, making it dynamical and restoring explicit general covariance 
and relativity. A second dynamical metric opens a whole new window for the above mentioned 
conceptual issues and actually does exhibit a stable flat Friedmann-Lemaître-Robertson-Walker 
(FLRW) solution [56]. This has also been an active area of research lately, but unfortunately 
it seems hard to obtain models where all perturbations are bounded on the backgrounds of 
interest [57-59]. 

The above considerations lead us to wonder whether there might be a way to construct a 
theory of massive gravity in terms of a single metric $g_{\mu\nu}$ that is both explicitly covariant and 
privileges no reference frame. It turns out that this is possible, but that the price to pay is the 
loss of space-time locality. 

1.4 Non-local gravity 

A non-local theory is a theory in which the equations of motion are not differential but 
integro-differential, with both space and time integrations. Therefore, the dynamics of the field at 
$x$ do not only depend on the values of this field in the infinitesimal neighborhood of $x$, but 
on a finite or infinite region of space-time. In particular, in the case of time non-localities 
the corresponding physics exhibit memory effects. Since the field value at $t + dt$ depends on 
the field values on a finite past interval $[t_0, t]$, the field “remembers” its history. Here we will 
restrict to non-local operators that are the inverses of some differential operators. Then, general 
covariance will imply that space and time non-localities come together. 

Non-local modifications of GR have been considered in the early attempts to construct 
degravitating mechanisms [28,29]. Moreover, they also appear from loop corrections to the 
quantum effective action for the metric, i.e. the action for the expectation value $\langle g_{\mu\nu} \rangle$ [60-64]. 
Based on this justification, phenomenological non-local modifications of GR have already been 
considered as possible explanations of dark energy, with [65] being the pioneering one. More 
generally, non-local effects may appear in many classical effective descriptions where dissipative 
effects or subsystems are considered [66,67]. 

In our work during my PhD we first tried to construct a generally-covariant 
theory of massive gravity at the price of non-locality [68], based on an earlier construction 
[29, 69] which rather focused on its degravitation properties. The corresponding cosmology 
not being viable, we proceeded with the study of non-local modified gravity models that are 
still controlled by a fixed mass parameter, but in which the graviton remains massless [70,71]. 
These theories contain ghost modes, i.e. fields with negative kinetic energy, and we have spent 
some time understanding their effects both at the classical and quantum levels [68,71,72]. 
Independent of the work in which I have been involved, the group has been very productive on 
the phenomenological analysis of these models [73-78]. 

$\bar{g}_{\mu\nu} \neq \eta_{\mu\nu}$, and not necessarily $f_{\mu\nu}$, then the field $h_{\mu\nu}$ transforms homogeneously only when the diffeomorphism 
generator $\xi^\mu$ is a Killing vector of $\bar{g}_{\mu\nu}$. Thus, the global space-time transformations are not the Poincaré group 
any more and the notion of a massive particle becomes ill-defined. 
1.5 Thesis summary 


In this thesis we will describe part of the above-mentioned work and will also try to extend a 
bit further some of its concepts, constructions and conclusions. In the second chapter, we will 
start by revisiting linear massive gauge theories, since manipulating them will be important 
in understanding how to construct and especially analyze non-local theories. In particular, we 
will see how the field components of these theories split into dynamical/non-dynamical modes 
and the relation to the constraints of gauge theory, an identification which will be crucial in the 
non-local case. Part of this analysis will also cover a study that we carried out in [79] before we 
started the research on non-local gravity. It concerns a hidden symmetry in massive linearized 
gravity and the thorough analysis we will perform here will hopefully allow us to understand 
that feature better. The chapter will end with a non-local formulation of these local theories 
and a construction of a more general, genuinely non-local, theory of a linear massive graviton, 
with a scalar mode that is not necessarily a ghost. The latter part contains unpublished original 
material. 

This will bring us to the subject of non-local field theory, so in the third chapter we will 
discuss the many subtleties that arise when considering non-localities. Indeed, a first feature 
is that the variational principle has to be generalized in order to provide causal equations 
of motion. Moreover, non-local theories cannot be quantized without enlarging their set of 
solutions in the classical limit, so that they can only be interpreted as classical effective theories. 

Most importantly however, their dynamical structure must be clarified in order to properly 
settle classical stability issues. This is a subject that has not been treated rigorously enough 
in some important part of the related literature, in my opinion. An original part of this thesis 
consists in unveiling the misunderstanding that lies at the origin of the confusion. Indeed, as 
we shall see, one has to separate the notion of degree of freedom and dynamical field (or 
“radiative”, “propagating” field). Whereas the two notions are equivalent in local field theory, 
this is no longer true in the presence of non-localities. If some field has its initial conditions 
constrained, and thus does not represent a degree of freedom, this does not necessarily mean 
that it does not propagate. 

Then, in the fourth chapter we will come back to the linear non-local theory constructed 
in chapter 2 and we will try to extend it to a generally-covariant non-local theory of massive 
gravity. There are two possible procedures, the “action-based” one and the “projector-based” 
one, whose resulting theories can be very different. After having constructed a class of models 
in both cases, we will apply some phenomenological constraints in order to reduce the number 
of free parameters. For the projector-based model the result will be that the tensor modes 
cannot be massive, while in the action-based model they can, but the corresponding mass term 
is irrelevant for the cosmological background. Since this is the part that will interest us here, 
the action-based model can also be taken with zero tensor mass. What is then left is the mass 
of the scalar mode, and the two models are one-parameter extensions of the models proposed 
by Maggiore [73] and Maggiore and Mancarella [75]. The extensions continuously interpolate 
between these models and GR with a cosmological constant, so that the phenomenological 
successes of the former should remain valid for the extended models as well. 

In the last chapter we will analyze the background cosmology, using both numerical simulations 
and analytical approximations. The analysis of the one-parameter extensions is an 
original part of this thesis and confirms that they become indistinguishable from $\Lambda$CDM for 
large values of the extension parameter. We will finish with a discussion of the fact that these 
solutions are phenomenologically viable, despite the presence of a ghost mode. 

Finally, in appendix A we have tried to provide a more or less rigorous mathematical 
support for the non-local operators that are invoked in generally-covariant non-local theories. 
These correspond to the generalization of the integration kernels of Green’s theory, which 
are convolved with functions, to “bi-tensors” in differential geometry, that are convolved with 
tensors. The appendix also contains derivations of the properties of these operators that are 
most useful to us. For the reader who is less interested in these technicalities, rest assured 
that whenever some property or definition is used, on top of referring to sections of 
this appendix we will also give lighter explanations that should satisfy (but not bore) a more 
physically-oriented mind. 

I acknowledge the use of Mathematica and especially of the “xACT” package for symbolic 
tensor computations [80]. 


1.6 Notation & conventions 


We work on a $D$-dimensional manifold $\mathcal{M}$, also define $d \equiv D - 1$ and we focus on the case 
$D \geq 4$. The manifold $\mathcal{M}$ is equipped with a Lorentzian metric $g$, that is, a symmetric covariant 
tensor of rank 2 whose component matrix $g_{\mu\nu}$ in some local coordinates has eigenvalues with 
the sign signature $(-,+,\dots,+)$ and thus $g \equiv \det(g_{\mu\nu}) \in \mathbb{R}^*_-$. We denote by $\eta = {\rm diag}(-1,1,\dots,1)$ the Minkowski 
metric and use the convention $\varepsilon_{01\dots d} = -\varepsilon^{01\dots d} = +1$ for the Levi-Civita 
symbol, so that 

$$ \frac{1}{D!}\sqrt{-g}\,\varepsilon_{\mu_1\dots\mu_D}\,dx^{\mu_1}\wedge\dots\wedge dx^{\mu_D} = \sqrt{-g}\,d^Dx , \tag{1.6.1} $$

is the volume D-form. For the Riemann and Ricci tensors the conventions are 


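$$ R^\rho{}_{\sigma\mu\nu} = \partial_\mu\Gamma^\rho_{\nu\sigma} - \partial_\nu\Gamma^\rho_{\mu\sigma} + \Gamma^\rho_{\mu\alpha}\Gamma^\alpha_{\nu\sigma} - \Gamma^\rho_{\nu\alpha}\Gamma^\alpha_{\mu\sigma} , \qquad R_{\mu\nu} = R^\rho{}_{\mu\rho\nu} , \qquad R = g^{\mu\nu}R_{\mu\nu} , \tag{1.6.2} $$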


and for the Christoffel symbols 

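$$ \Gamma^\rho_{\mu\nu} = \frac12\,g^{\rho\sigma}\left( \partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu} \right) . \tag{1.6.3} $$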

We use $\Box \equiv \partial_\mu\partial^\mu$ to denote the d’Alembertian and $\Delta \equiv \partial_i\partial_i$ to denote the Laplacian on 
flat space-time. The space-time Fourier transform convention is 


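$$ \phi(x) = \int \frac{d^Dk}{(2\pi)^D}\,\phi(k)\exp\left[ i\eta_{\mu\nu}k^\mu x^\nu \right] , \qquad \phi(k) = \int d^Dx\,\phi(x)\exp\left[ -i\eta_{\mu\nu}k^\mu x^\nu \right] , \tag{1.6.4} $$

so for consistency the spatial Fourier transform is 

$$ \phi(\vec{x}) = \int \frac{d^dk}{(2\pi)^d}\,\phi(\vec{k})\exp\left[ i\vec{k}\cdot\vec{x} \right] , \qquad \phi(\vec{k}) = \int d^dx\,\phi(\vec{x})\exp\left[ -i\vec{k}\cdot\vec{x} \right] . \tag{1.6.5} $$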


We use natural units $\hbar = c = 1$ and also the following reduced Planck masses $M \equiv (8\pi G)^{-1/2}$ 
and $\tilde{M} \equiv (16\pi G)^{-1/2}$, which are actually masses only in $D = 4$. 
Chapter 2 

Linear massless/massive gauge theories 


In this chapter we propose to study the massive and massless theories of spin-1 and spin-2 
fields through several approaches, each one of them providing a complementary viewpoint. As 
already mentioned in the introduction, the notions of degree of freedom and of dynamical field 
are not equivalent in non-local field theory. It is therefore important to first understand their 
equivalence in local field theory, and especially gauge theory, where not all fields propagate. 
We will thus see, in many different ways, how the field content splits into dynamical and non- 
dynamical fields and how this is related to the degrees of freedom of the theory. This will 
then allow us to understand the spectrum of non-local gauge theories, without making any 
confusion between the constraints that are due to non-locality and the ones that are due to 
gauge symmetry. Finally, this analysis will also bring us useful by-products that will allow us 
to construct linear non-local massive spin-2 gauge theories. 

Although our main interest is in gravity and thus the spin-2 field, the spin-1 case will be 
very helpful in facilitating our intuition and argumentation. Indeed, it shares many properties 
with the spin-2 case, but at the same time has fewer fields, thus simplifying our analysis. On 
top of this, the spin-1 theory stands as exceptional, regarding some important properties, when 
compared with higher spin theories $s \geq 2$. Thus, the study of the spin-1 case will turn out to 
be essential in contrasting with some peculiarities of the spin-2 case. 

For the kinetic term of the theory, in each case, we will consider the only one that is 
stable, i.e. the one that exhibits the highest gauge symmetry. These are the kinetic terms 
of electrodynamics and of linearized GR. For the mass terms however we will consider the 
most general quadratic Lorentz-invariant potential, which in the case of the spin-2 field usually 
activates a ghost mode. Indeed, that ghost will be a recurrent subject in this thesis, so it is 
important that we include these actions as well in our study. Moreover, considering this general 
case will lead us to the definition of projectors that are going to be very useful for constructing 
a genuinely non-local ghost-free theory. This chapter is based on, and extends, the following 
papers [68,79]. 
2.1 Technical preliminaries 


2.1.1 Inverse differential operators 

In this chapter we will consider only spatially localized fields, that is, fields which tend to 
zero sufficiently fast at infinity and which can therefore be represented by their spatial Fourier 
transform. On this space of fields the operator $\Delta - m^2$, where $\Delta \equiv \partial_i\partial_i$ is the Laplacian, is 
negative-definite, as is obvious in its Fourier representation. It has therefore zero kernel when 
acting on fields whose values and first spatial derivatives tend to zero at spatial infinity. This 
means that it admits a unique (right and left) inverse $(\Delta - m^2)^{-1}$, and actually a unique power 
$(\Delta - m^2)^{-k}$ for $k \geq 1$, which can again be obtained through its Fourier representation. These 
operators commute among themselves and with spatial derivatives. 
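
As an illustration of the Fourier construction of this inverse, here is a minimal numerical sketch. It rests on simplifying assumptions that are not in the text: a single spatial dimension, a periodic grid large enough that a Gaussian test field is effectively localized, and an arbitrary value of $m$. The function names `helmholtz` and `inv_helmholtz` are hypothetical. The result is compared with the one-dimensional Green's function $-e^{-m|x|}/(2m)$ of $\partial_x^2 - m^2$.

```python
import numpy as np

N, BOX, m = 1024, 40.0, 1.3          # grid size, box length, mass (assumed)
x = np.linspace(-BOX / 2, BOX / 2, N, endpoint=False)
dx = x[1] - x[0]
k = 2.0 * np.pi * np.fft.fftfreq(N, d=dx)   # wave numbers of the grid

def helmholtz(u):
    """Apply Delta - m^2 through its Fourier symbol -(k^2 + m^2)."""
    return np.real(np.fft.ifft(-(k**2 + m**2) * np.fft.fft(u)))

def inv_helmholtz(f):
    """Apply (Delta - m^2)^(-1); the symbol never vanishes, so no kernel."""
    return np.real(np.fft.ifft(np.fft.fft(f) / (-(k**2 + m**2))))

f = np.exp(-x**2)                    # spatially localized test field
u = inv_helmholtz(f)

# The same operator acts as a right-inverse and as a left-inverse.
err_right = np.max(np.abs(helmholtz(inv_helmholtz(f)) - f))
err_left = np.max(np.abs(inv_helmholtz(helmholtz(f)) - f))

# Cross-check against the decaying Green's function of d^2/dx^2 - m^2 in 1D.
green = -np.exp(-m * np.abs(x[:, None] - x[None, :])) / (2.0 * m)
u_green = green @ f * dx
err_green = np.max(np.abs(u - u_green))

print(err_right, err_left, err_green)
```

The first two errors sit at round-off level, while the third is limited by the quadrature of the kink of the Green's function at coincident points, so it shrinks as the grid is refined.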

These nice properties do not generalize to the Klein-Gordon operator $L \equiv \Box - m^2$ because it 
has a non-trivial kernel, the vector space generated by the plane-wave solutions (see appendix 
A.2.2 for detailed properties). It therefore admits more than one right-inverse, $L L^{-1} = {\rm id}$, and 
no left-inverse in general. The space of inverses is parametrized by the elements of the kernel 
since any two inversions are related by a homogeneous solution 

$$ L\left[ L^{-1}(\phi) - L'^{-1}(\phi) \right] = 0 . \tag{2.1.1} $$

Thus, if one picks a $L^{-1}$ once and for all, all other inversions are found by adding a homogeneous 
solution, as we know from calculus. Here we will denote by “$L^{-1}$” the inverses of $L$ that are 
also $\mathbb{R}$-linear operators 

$$ L^{-1}(\alpha\phi + \beta\phi') = \alpha L^{-1}\phi + \beta L^{-1}\phi' , \qquad \alpha, \beta = {\rm const} \in \mathbb{R} , \tag{2.1.2} $$

which must be contrasted with the general inverse operator which is affine 

$$ L^{-1}_{\rm gen.}(\phi) = L^{-1}\phi + \psi , \qquad L\psi = 0 , \tag{2.1.3} $$

with $\psi$ independent of $\phi$. The operators $L^{-1}$ can then be represented by the convolution with 
a Green’s distribution 

$$ (L^{-1}\phi)(x) = \int d^Dy\,G(x,y)\,\phi(y) , \qquad L_x G(x,y) = \delta^{(D)}(x - y) , \tag{2.1.4} $$


which by Poincaré covariance must be of the form $G(x,y) = G(x - y)$. The quantity $iG$ is also 
called a “propagator” depending on the context. The different choices of $L^{-1}$ now correspond to 
the different time boundary conditions of $G(x)$, which in turn correspond to the time boundary 
conditions of $(L^{-1}\phi)(x)$ 1 . 

Two Green’s functions are of particular relevance for physics on flat space-time, the retarded 
one in classical field theory and the Feynman one in perturbative QFT. Imposing trivial initial 
conditions 

$$ \lim_{x^0\to-\infty} G(x) = 0 , \qquad \lim_{x^0\to-\infty} \partial_{x^0}G(x) = 0 , \tag{2.1.5} $$

gives the retarded propagator 

$$ G_r(x) = \lim_{\epsilon\to 0^+} \int \frac{d^Dk}{(2\pi)^D}\,\frac{\exp\left( i\eta_{\mu\nu}k^\mu x^\nu \right)}{(k^0 + i\epsilon)^2 - \vec{k}^2 - m^2} , \tag{2.1.6} $$

1 Given the set of fields we consider, the spatial boundary conditions are zero at infinity. 
while imposing no positive-frequency ingoing waves and no negative-frequency outgoing waves 

$$ \lim_{x^0\to-\infty} G(x) = \int \frac{d^dk}{(2\pi)^d} \int_{-\infty}^{0} \frac{dk^0}{2\pi}\,a(k)\exp\left[ i\eta_{\mu\nu}k^\mu x^\nu \right] , \tag{2.1.7} $$

$$ \lim_{x^0\to+\infty} G(x) = \int \frac{d^dk}{(2\pi)^d} \int_{0}^{\infty} \frac{dk^0}{2\pi}\,a(k)\exp\left[ i\eta_{\mu\nu}k^\mu x^\nu \right] , \tag{2.1.8} $$

gives the Feynman propagator 

$$ G_F(x) = \lim_{\epsilon\to 0^+} \int \frac{d^Dk}{(2\pi)^D}\,\frac{\exp\left( i\eta_{\mu\nu}k^\mu x^\nu \right)}{-k^2 - m^2 + i\epsilon} . \tag{2.1.9} $$


Indeed, by writing (2.1.4) in Fourier space, and using the converging contour integrals with the 
residue theorem, we get that $L^{-1}\phi$ obeys the above mentioned boundary/initial conditions in 
each respective case. The domains of definition of the corresponding operators $L_r^{-1}$ and $L_F^{-1}$ 
are the fields obeying the same boundary conditions as $G$ in each respective case. On their 
respective domains of definition, both operators commute with partial derivatives and are also 
left-inverses 2 . In practice the $L_r^{-1}$ may act after some derivatives, in which case it is convenient 
to have a stronger condition for its applicability. At the bottom of appendix A.2.2 we provide 
such a condition which we call “having finite past”. Loosely speaking, it amounts to $\phi$ being 
non-zero only after a finite time. 
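
The causal character of the retarded inversion can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the construction of the appendix: it drops all spatial dependence, so that $L = -\partial_t^2 - m^2$, for which the retarded Green's function is $G_r(t) = -\theta(t)\sin(mt)/m$. The source is a smooth bump with “finite past”, and the grid spacing, the value of $m$ and the variable names are arbitrary illustrative choices.

```python
import numpy as np

m = 1.5                        # mass parameter (arbitrary units, assumed)
dt = 0.01
t = np.arange(0.0, 10.0, dt)   # time grid starting at t = 0

# Smooth source that is non-zero only for 2 < t < 3 ("finite past").
s = (t - 2.5) / 0.5
J = np.where(np.abs(s) < 1.0, np.cos(0.5 * np.pi * s) ** 4, 0.0)

# Retarded Green's function of L = -d^2/dt^2 - m^2 sampled at t' >= 0;
# it vanishes identically for negative arguments.
G_r = -np.sin(m * t) / m

# phi(t_i) = sum_{j <= i} G_r(t_i - t_j) J(t_j) dt  (discrete retarded convolution)
phi = np.convolve(J, G_r)[: t.size] * dt

# Residual of L phi = J using a centred second finite difference.
d2phi = (phi[2:] - 2.0 * phi[1:-1] + phi[:-2]) / dt**2
residual = -d2phi - m**2 * phi[1:-1] - J[1:-1]

print("max |phi| before the source:", np.max(np.abs(phi[t < 1.99])))
print("max |L phi - J|:", np.max(np.abs(residual)))
```

The field vanishes identically before the source switches on and oscillates freely afterwards, while the residual of the sourced equation stays at the level of the discretization error. Adding any homogeneous solution $\sin(mt)$ or $\cos(mt)$ would give another right-inverse, but one that violates the trivial initial conditions.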

The retarded Green’s function arises in situations where one wants to solve a sourced 
equation 

$$ L\phi = J , \qquad \phi(x) = \int d^Dy\,G_r(x - y)\,J(y) , \tag{2.1.10} $$

in a causal way, i.e. such that $\phi(x)$ depends only on $J(x')$ with $x'$ in the past light-cone of 
$x$. This is indeed the case as we can see by the real space representation in $D = 4$ given in 
equation (A.2.22) of appendix A.2.2. Flipping the sign of $\epsilon$ in (2.1.6) amounts to flipping the 
sign of $x^0$, after having redefined $k^0 \to -k^0$, so this gives us the advanced propagator $G_a$ which 
is supported on the future light-cone and is thus anti-causal. We thus have 

$$ G_r(-x^0, \vec{x}) = G_a(x^0, \vec{x}) , \tag{2.1.11} $$

while $G_r(x^0, \vec{x})$ is symmetric under the individual sign flip of spatial arguments. In perturbative 
QFT it is rather the Feynman propagator which is relevant because it is the one that arises 
in the computation of the scattering amplitudes. More precisely, it represents the particles of 
$\phi$ which mediate the interaction between sources $J$ at different space-time points. To see this 
one can invoke the corresponding action 


$$ S = \lim_{\epsilon\to 0^+} \int d^Dx \left[ \frac12\,\phi\left( \Box - m^2 + i\epsilon \right)\phi - \phi J \right] , \tag{2.1.12} $$


which has been regularized with an e factor that ensures the convergence of the corresponding 
path integral. Thus, unitarity of e lS forces upon us this choice for the sign of e. We then have 

2 See appendices A.3.2 and A.3.3 where we show this for D,^ 1 in real space and on arbitrary globally hyperbolic 
space-times. It can also be worked-out in Fourier space for both Dy 1 and Dp 1 , since if the Fourier representation 
gives a finite result, i.e. if the operators are defined, then it is obvious that they commute with the derivatives 
and are also left-inverses. 
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that by integrating-out (j> 


D(j>e 


iS 


exp 


d D x JGpJ 


(2.1.13) 


Differentiating twice (2.1.12) with respect to the source one gets that the Feynman propagator
is the two-point function, which in Fourier space reads

\[ \langle 0 | T\, \phi\, \phi | 0 \rangle (k) = - \frac{i}{k^2 + m^2 - i\epsilon} . \tag{2.1.14} \]

Actually, this path integration has been performed a bit formally since we have not specified
its boundary conditions. However, these are already fixed for consistency reasons and there
are several instructive ways to see this that will be useful for us at some point later on. First,
note that the path integral is dominated by the classical solutions, which in this case are given
by free wave-packets at infinity (where $J = 0$) with dispersion relation

\[ k^0 = \pm \sqrt{\vec{k}^2 + m^2 - i\epsilon} . \tag{2.1.15} \]

Thus, positive-frequency modes diverge at past infinity, while negative-frequency modes
diverge at future infinity. This means that the only boundary conditions for which the path
integral makes sense around classical solutions are the Feynman ones (2.1.8), i.e. only
negative-frequency waves at past infinity and only positive-frequency waves at future infinity.
Conversely, if one imposes these boundary conditions but sets $\epsilon = 0$, then the result of
integrating out $\phi$ is the Feynman propagator. One can also understand these boundary conditions
from the point of view of the canonical quantization. One simply needs


\[ \langle 0 | T \dots | 0 \rangle \sim \int \mathcal{D}\phi\, \dots\, e^{iS[\phi]} , \tag{2.1.16} \]

where $|0\rangle$ is the vacuum state at past infinity and $\langle 0|$ is the one at future infinity. We then have
that $a|0\rangle = 0$, where $a$ is the free annihilation operator corresponding to the amplitude of the
modes with positive frequency, while $\langle 0|a^\dagger = 0$, where $a^\dagger$ is the creation operator corresponding
to the amplitude of the modes with negative frequency.
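
The statement about which modes diverge can be checked by expanding the $\epsilon$-regularized dispersion relation to first order in $\epsilon$. The following is only a sketch: the shorthand $\omega_k$ and the plane-wave convention $e^{-ik^0 t}$ are assumptions introduced here for the check, not notation taken from the text.

```latex
% Assumptions (not from the text): \omega_k \equiv \sqrt{\vec k^2 + m^2} > 0,
% and the time dependence of a mode is written as e^{-i k^0 t}.
\begin{align}
k^0 &= \pm\sqrt{\omega_k^2 - i\epsilon}
     \simeq \pm\left(\omega_k - \frac{i\epsilon}{2\omega_k}\right), \\
% upper sign: positive frequency, grows without bound as t -> -infinity
\left| e^{-ik^0 t} \right| &\simeq e^{-\epsilon t/(2\omega_k)}
     \quad\text{for}\quad k^0 \simeq +\omega_k - \frac{i\epsilon}{2\omega_k} , \\
% lower sign: negative frequency, grows without bound as t -> +infinity
\left| e^{-ik^0 t} \right| &\simeq e^{+\epsilon t/(2\omega_k)}
     \quad\text{for}\quad k^0 \simeq -\omega_k + \frac{i\epsilon}{2\omega_k} .
\end{align}
```

With this convention the positive-frequency branch is damped towards the future and blows up towards the past, and conversely for the negative-frequency branch, which is the pattern that selects the Feynman boundary conditions.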

Finally, note that since $G_F(k)$ is a function of $k^2$, we have that $G_F(x)$ is symmetric under
the individual flip of any of its arguments, so it is symmetric under time-reversal in particular.
As a consequence it has both retarded $\sim \theta(x^0 - y^0)$ and advanced $\sim \theta(y^0 - x^0)$ parts. This is
expected because in a scattering process the information of the whole interval $t \in\, ]-\infty, \infty[$ is
required, so that for finite $t$ the dependence is acausal.


2.1.2 Degrees of freedom, dynamical and non-dynamical fields 

In non-local theories the question of degrees of freedom of a theory can be a subtle issue, so
it is important that we define clearly the words we will be using. The number of degrees of
freedom of a field theory, denoted by $N_{\rm f}$, is the number of initial field configurations that we
are free to choose in order to evolve the system uniquely in time. In the theories we are going
to study below we will find two types of fields. The "dynamical" (or "radiative") ones are those
obeying a second-order equation in time

\[ \left( \Box - m^2 \right) \phi = J , \tag{2.1.17} \]

while the "non-dynamical" (or "non-radiative") ones are those that obey a purely spatial
differential equation

\[ \left( \Delta - m^2 \right) \phi = J . \tag{2.1.18} \]

In the dynamical case (2.1.17) the solution for a $\phi$ which is solely excited by $J$ takes the form
(2.1.10). This means that, by measuring $\phi$ at some $x$, one can deduce some information about
the excitations of $J$ at some other $x'$ (as long as $x$ is in the future light-cone of $x'$). We thus say
that the field "propagates" the information of the source. This is how one can gain information
about a distant object, by detecting the waves it emits in some dynamical field. Going even
further, this is how two "sources" at different space-time points are going to interact through
the "force" mediated by $\phi$. Note that this scenario does not focus on the initial conditions
that would have been given to $\phi$. These are actually trivial since $\phi$ is solely excited by the
source. Thus, the forces that are present in the theory correspond to the dynamical fields,
independently of whether these are degrees of freedom or not. Finally, since the dynamical
fields induce poles in the propagator, and "propagate" the information of sources, one can
equivalently refer to them as "propagating" fields.

In the non-dynamical case (2.1.18) the equation seems to be in conflict with relativity since
it is not Lorentz invariant and implies an action at a distance, i.e. $\phi$ reacts instantaneously to
the source $J$. As we will see however, in these cases, either $\phi$ will not be physically observable
(gauge-dependent), or it will itself be a spatially non-local functional of the fundamental fields.
In the latter case the measurement of $\phi$ is spatially non-local to begin with and can thus not
be performed at a single time, so there is no contradiction with relativity. In that case, the
information of the source does not propagate but is instead communicated simultaneously, to
an unphysical or non-local field. Thus, non-dynamical fields do not allow us to gain local
information on the source's dynamics nor do they mediate any interaction.

Now, in the dynamical case, we have that one needs to provide the initial conditions $\phi(t_i, \vec{x})$
and $\dot\phi(t_i, \vec{x})$ in order to evolve the field in time, so that it corresponds to $N_{\rm f} = 2$. In the
non-dynamical case we have that the field is totally determined by the source at every time
and, in particular, at the initial condition surface, so that $N_{\rm f} = 0$. In the dynamical case the
solutions for $J = 0$ are linear superpositions of plane-waves, whose vector space is isomorphic
to the initial data space, while in the non-dynamical case the source-free solution is $\phi = 0$.

It therefore seems obvious that, if one denotes the number of dynamical fields by $N_{\rm d}$, then
$N_{\rm f} = 2 N_{\rm d}$³. This appears as a trivial statement in local field theory, but does not hold at all
for non-local theories. It is thus important to stress in advance that the notion of dynamical
field and degree of freedom should be considered separately.


3 If the dynamical equations were of order $n$ in the time-derivatives, this would give $N_{\rm f} = n N_{\rm d}$.


2.2 Standard Lagrangian approach


2.2.1 Spin 1

Massive

So let us start by considering the case of massive electrodynamics, that is, the Proca action

\[ S = \int d^D x \left[ -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2} m^2 A_\mu A^\mu + A_\mu j^\mu \right] , \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu , \tag{2.2.1} \]

where $j^\mu$ is a conserved external source, i.e. $\partial_\mu j^\mu = 0$, and the mass parameter $m$ breaks the
U(1) gauge symmetry⁴

\[ \delta A_\mu = -\partial_\mu \theta . \tag{2.2.2} \]

The equations of motion are

\[ \partial_\mu F^{\mu\nu} - m^2 A^\nu = -j^\nu , \tag{2.2.3} \]

and taking the divergence one gets

\[ m^2 \partial_\mu A^\mu = 0 , \tag{2.2.4} \]

so we can rewrite them as

\[ \left( \Box - m^2 \right) A^\mu = -j^\mu , \qquad \partial_\mu A^\mu = 0 . \tag{2.2.5} \]

Thus, as soon as $m \neq 0$, and therefore the gauge symmetry is lost, the usual Lorentz gauge
condition of massless electrodynamics $\partial_\mu A^\mu = 0$ appears as the scalar part of the equations of
motion. The latter along with the $\mu = 0$ components of the Klein-Gordon equation imply that
$A_0$ is non-dynamical

\[ \left( \Delta - m^2 \right) A_0 = \partial_i \dot{A}_i - j_0 , \qquad \dot{A}_0 = \partial_i A_i , \tag{2.2.6} \]

and that its initial conditions are totally determined in terms of the ones of $A_i$ and $j_0$. We are
then left with

\[ \left( \Box - m^2 \right) A_i = -j_i , \tag{2.2.7} \]

that is, $d$ unconstrained fields transforming in the vector representation of SO(d) and obeying
a massive Klein-Gordon equation. This amounts to $N_{\rm f} = 2 N_{\rm d} = 2d$ degrees of freedom,
corresponding to the initial conditions of $A_i$ and $\dot{A}_i$. In $d = 3$ this gives $N_{\rm d} = 3$.
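
The step from the Proca equation to the constraint on $A_0$ can be spelled out explicitly. This is a sketch under the assumption of a mostly-plus signature, $\eta = {\rm diag}(-,+,\dots,+)$, which is what the propagator (2.1.14) and the canonical action (2.3.1) suggest; with that choice $\Box = -\partial_t^2 + \Delta$ and $A^0 = -A_0$.

```latex
% Assumption: mostly-plus signature, \Box = -\partial_t^2 + \Delta, A^0 = -A_0.
\begin{align}
% divergence of the equation of motion: the F-term drops by antisymmetry,
% the source term drops by conservation
\partial_\nu\left(\partial_\mu F^{\mu\nu} - m^2 A^\nu\right) = -\partial_\nu j^\nu
  &\quad\Rightarrow\quad m^2\,\partial_\nu A^\nu = 0 , \\
% the Lorentz condition written in components
\partial_\mu A^\mu = -\dot A_0 + \partial_i A_i = 0
  &\quad\Rightarrow\quad \dot A_0 = \partial_i A_i , \\
% time component of the Klein-Gordon equation, with \ddot A_0 eliminated
-\ddot A_0 + (\Delta - m^2) A_0 = -j_0
  &\quad\Rightarrow\quad (\Delta - m^2) A_0 = \partial_i \dot A_i - j_0 .
\end{align}
```

The last line contains no time derivative of $A_0$, which is why $A_0$ and $\dot A_0$ carry no free initial data.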

Massless

In the case where $m = 0$, we have the gauge symmetry (2.2.2), so the Lorentz gauge $\partial_\mu A^\mu = 0$
can be reached by performing a gauge transformation, the result being again (2.2.6) and (2.2.7),
but with $m = 0$. Now however these equations have a residual gauge symmetry given by the
gauge parameters satisfying $\Box\theta = 0$. To see what we can do with it, we can consider the general
solution of the divergence of (2.2.7)

\[ \partial_i A_i = \phi^{\rm hom} - \Box^{-1} \partial_i j_i , \tag{2.2.8} \]

where $\phi^{\rm hom}$ is a homogeneous solution $\Box \phi^{\rm hom} = 0$. Remember that for the action of $\Box^{-1}$ to be
defined the source $\partial_i j_i$ must have finite past. Using the residual gauge transformation on that
equation we get

\[ \partial_i A_i - \Delta\theta = \phi^{\rm hom} - \Box^{-1} \partial_i j_i . \tag{2.2.9} \]

4 In realistic cases where $j^\mu$ is also made of fundamental fields, the argument that the $m = 0$ action is
gauge-invariant because $\partial_\mu j^\mu = 0$ no longer holds. Indeed, conservation equations can only hold for some field
configurations, namely the on-shell ones, whereas a symmetry should hold for all field configurations in the
action. There are then two possibilities. Either the $A_\mu j^\mu$ term corresponds to non-minimal couplings to other
fields through $F_{\mu\nu}$, in which case it is itself gauge-invariant, or it emerges through minimal couplings that involve
the covariant derivative $\nabla \equiv \partial - iA$, in which case its variation is compensated by a non-trivial variation of
$A$-independent terms. Then, because of that gauge symmetry, by Noether's theorem for local symmetries we
have that $\partial_\mu j^\mu = 0$ on-shell. In the massive case, if the matter sector is unchanged, then we still have a global
U(1) symmetry and it is thus Noether's theorem for global symmetries which implies $\partial_\mu j^\mu = 0$.

It is thus possible to cancel $\phi^{\rm hom}$ by choosing

\[ \theta = -\Delta^{-1} \phi^{\rm hom} , \tag{2.2.10} \]

so that $\partial_i A_i$ is totally determined by the source and its initial conditions are thus fixed. The
degrees of freedom are therefore the $N_{\rm f} = 2(d-1)$ components of the transverse part $A_i^{\rm t}$, i.e.
$\partial_i A_i^{\rm t} = 0$, and its first derivatives.

It may appear however that the longitudinal part $\partial_i A_i$ is still a dynamical field, since it
obeys a dynamical equation $\Box \partial_i A_i = -\partial_i j_i$, even though it does not correspond to a degree
of freedom. This would be in contradiction with $N_{\rm f} = 2 N_{\rm d}$. As it turns out, this is only an
artefact of our choice of gauge, which is the natural one from the point of view of the massive
theory, since then $\partial_\mu A^\mu = 0$ holds continuously with $m \to 0$. Indeed, one can always introduce
a $\Box^{-1} J$ term in the gauge parameter to make it appear as a source of a gauge-dependent
component. We can therefore choose a different gauge to start with, such as the one which
precisely eliminates the longitudinal mode

\[ \partial_i A_i = 0 . \tag{2.2.11} \]

This choice is more natural from the Hamiltonian point of view, as we will see soon. The
equation of motion of $A_0$ then reads

\[ \Delta A_0 = -j_0 , \tag{2.2.12} \]

and we have that the divergence of the equation of $A_i$ is automatically satisfied. We thus have
that the initial conditions of both $A_0$ and $\partial_i A_i$ are fixed and that these fields are non-dynamical.
We can therefore conclude that in the massless theory we have indeed $N_{\rm f} = 2 N_{\rm d} = 2(d-1)$,
which for $d = 3$ gives $N_{\rm d} = 2$.
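
As a consistency check of the residual gauge fixing used above, one can verify that the parameter which removes the homogeneous solution is itself an allowed residual transformation. The sketch below assumes that $\Delta^{-1}$ commutes with $\Box$ on the fields considered, which is not stated explicitly in the text.

```latex
% Assumption: \Delta^{-1} commutes with \Box on the relevant space of fields.
\begin{align}
\theta = -\Delta^{-1}\phi^{\rm hom}
  &\quad\Rightarrow\quad
  \Box\theta = -\Delta^{-1}\,\Box\phi^{\rm hom} = 0 , \\
% inserting this \theta in the transformed equation for the divergence of A_i
\partial_i A_i = \phi^{\rm hom} + \Delta\theta - \Box^{-1}\partial_i j_i
  &\quad\Rightarrow\quad
  \partial_i A_i = -\Box^{-1}\partial_i j_i .
\end{align}
```

So the transformation stays inside the Lorentz gauge and leaves the longitudinal part entirely fixed by the source.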


2.2.2 Spin 2 

Let us now consider linearized GR along with the most general quadratic potential

\[
\begin{aligned}
S &= \int d^D x \left[ -\frac{1}{2} \partial_\mu h_{\nu\rho} \partial^\mu h^{\nu\rho} + \partial_\mu h^{\mu\rho} \partial^\nu h_{\nu\rho} - \partial_\mu h^{\mu\nu} \partial_\nu h + \frac{1}{2} \partial_\mu h\, \partial^\mu h - \frac{1}{2} m^2 \left( h_{\mu\nu} h^{\mu\nu} - (1+\alpha) h^2 \right) + h_{\mu\nu} T^{\mu\nu} \right] \\
  &= \int d^D x \left[ h_{\mu\nu} \mathcal{E}^{\mu\nu\rho\sigma} h_{\rho\sigma} - \frac{1}{2} m^2 \left( h_{\mu\nu} h^{\mu\nu} - (1+\alpha) h^2 \right) + h_{\mu\nu} T^{\mu\nu} \right] ,
\end{aligned}
\tag{2.2.13}
\]

where $\mathcal{E}$ is known as the "Lichnerowicz operator"

\[ \mathcal{E}^{\mu\nu\rho\sigma} \equiv \frac{1}{2} \left[ \left( \eta^{\mu(\rho} \eta^{\sigma)\nu} - \eta^{\mu\nu} \eta^{\rho\sigma} \right) \Box - \eta^{\mu(\rho} \partial^{\sigma)} \partial^\nu - \eta^{\nu(\rho} \partial^{\sigma)} \partial^\mu + \eta^{\mu\nu} \partial^\rho \partial^\sigma + \eta^{\rho\sigma} \partial^\mu \partial^\nu \right] . \tag{2.2.14} \]

The Fierz-Pauli theory corresponds to the choice $\alpha = 0$. Here $T_{\mu\nu}$ is some external conserved
source $\partial^\mu T_{\mu\nu} = 0$ and the mass term breaks the following linear gauge symmetry

\[ \delta h_{\mu\nu} = -\partial_\mu \xi_\nu - \partial_\nu \xi_\mu . \tag{2.2.15} \]

The equations of motion are

\[ \left( \Box - m^2 \right) h_{\mu\nu} - \eta_{\mu\nu} \left( \Box - (1+\alpha) m^2 \right) h - \partial_\mu \partial^\rho h_{\rho\nu} - \partial_\nu \partial^\rho h_{\rho\mu} + \eta_{\mu\nu} \partial^\rho \partial^\sigma h_{\rho\sigma} + \partial_\mu \partial_\nu h = -T_{\mu\nu} , \tag{2.2.16} \]

their divergence is

\[ m^2 \left( \partial^\mu h_{\mu\nu} - (1+\alpha) \partial_\nu h \right) = 0 , \tag{2.2.17} \]

and their trace is

\[ (D-2) \left( \partial^\mu \partial^\nu h_{\mu\nu} - \Box h \right) + \left( (1+\alpha) D - 1 \right) m^2 h = -T . \tag{2.2.18} \]

Taking the double divergence, we can simplify the trace equation to

\[ (D-2)\, \alpha\, \Box h + \left( (1+\alpha) D - 1 \right) m^2 h = -T , \tag{2.2.19} \]

which is a dynamical equation for $h$ only when $\alpha \neq 0$. In the following sections we will see
that in this case the kinetic term of $h$ has the wrong sign with respect to the rest of the fields,
so that $h$ is a ghost. For the moment, we can already say that if $\alpha = -(D-1)/D$ then $h$ is
massless, because the mass term solely depends on the traceless part $h_{\mu\nu} - \eta_{\mu\nu} h / D$, while if
$\alpha < -(D-1)/D$ then $h$ is also a tachyon. In particular, for $\alpha = -1/2$ it is a tachyon with
mass $-m^2$. On the other hand, if $\alpha = 0$, then (2.2.19) becomes an algebraic equation for $h$ and
the latter gets totally determined by the source

\[ h = -\frac{1}{(D-1)\, m^2}\, T , \tag{2.2.20} \]

so that it is no longer a degree of freedom, nor a dynamical field. Another peculiarity of this
choice for $\alpha$ is that the double divergence of the equation of motion, i.e. the divergence of
(2.2.17), is gauge-invariant⁵. This suggests that, although $m \neq 0$, there is some kind of leftover
gauge symmetry in the equations of motion, in contrast with massive electrodynamics where
both the equation and its divergence are not gauge-invariant. However, the equations of motion
are not invariant under any gauge transformation (2.2.15), even a pure-scalar one $\xi_\mu = \partial_\mu \theta$. We
will understand this point better in the following sections, when we will have the appropriate
technology at our disposal. For the moment we can simply note the interesting fact that for a
pure-scalar transformation

\[ \delta h_{\mu\nu} = -2 \partial_\mu \partial_\nu \theta , \tag{2.2.21} \]

the action varies by

\[ \delta S \sim \theta \left( \partial^\mu \partial^\nu h_{\mu\nu} - (1+\alpha) \Box h \right) + \alpha \left( \Box\theta \right)^2 , \tag{2.2.22} \]

so for $\alpha = 0$ this is proportional to the divergence of (2.2.17) and therefore vanishes for on-shell
$h_{\mu\nu}$ configurations⁶. Thus, if $h_{\mu\nu}$ is a solution, then

\[ S[h_{\mu\nu}] = S[h_{\mu\nu} - 2 \partial_\mu \partial_\nu \theta] . \tag{2.2.23} \]

Clearly, something special happens at the Fierz-Pauli point $\alpha = 0$, although it is not a gauge
symmetry. So let us start with this $\alpha = 0$ case.
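
The variation of the action under a pure-scalar transformation can be worked out term by term, which also fixes the overall coefficient left implicit above. This is a sketch: it uses only the mass term of the action (the kinetic term is gauge-invariant and the source term drops for a conserved $T^{\mu\nu}$), and the symbol $\simeq$ for "equal up to total derivatives" is introduced here for convenience.

```latex
% \simeq : equality up to total derivatives (notation introduced for this check).
\begin{align}
\delta h_{\mu\nu} &= -2\,\partial_\mu\partial_\nu\theta , \qquad
\delta h = -2\,\Box\theta , \\
\delta\left(h_{\mu\nu}h^{\mu\nu}\right)
  &= -4\,h^{\mu\nu}\partial_\mu\partial_\nu\theta + 4\left(\partial_\mu\partial_\nu\theta\right)^2
  \simeq -4\,\theta\,\partial^\mu\partial^\nu h_{\mu\nu} + 4\left(\Box\theta\right)^2 , \\
\delta\left(h^2\right)
  &= -4\,h\,\Box\theta + 4\left(\Box\theta\right)^2
  \simeq -4\,\theta\,\Box h + 4\left(\Box\theta\right)^2 , \\
\delta S
  &= -\frac{m^2}{2}\int d^Dx\,\left[\delta\left(h_{\mu\nu}h^{\mu\nu}\right) - (1+\alpha)\,\delta\left(h^2\right)\right] \\
  &\simeq 2m^2\int d^Dx\,\left[\theta\left(\partial^\mu\partial^\nu h_{\mu\nu} - (1+\alpha)\,\Box h\right)
     + \alpha\left(\Box\theta\right)^2\right] .
\end{align}
```

At $\alpha = 0$ the quadratic piece drops out and the remainder is $\theta$ times the divergence of the constraint, which vanishes on-shell.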

5 It is actually the linearization of the Ricci scalar.

6 This corresponds to the well-known fact that, when $h_{\mu\nu}$ takes the form $h_{\mu\nu} = \partial_\mu \partial_\nu \phi$ for some function $\phi$,
the Fierz-Pauli mass term is a total derivative. The generalization of this property to terms of cubic and higher
order in $\partial_\mu \partial_\nu \phi$ gives rise to the Galileon family of operators [25].
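Massive α = 0

Using (2.2.20) and (2.2.17) the system of equations simplifies to

\[ \left( \Box - m^2 \right) h_{\mu\nu} = -T_{\mu\nu} + \frac{1}{d} \left( \eta_{\mu\nu} T - \frac{1}{m^2} \partial_\mu \partial_\nu T \right) , \tag{2.2.24} \]

\[ \partial^\mu h_{\mu\nu} - \partial_\nu h = 0 , \tag{2.2.25} \]

\[ h = -\frac{1}{d m^2}\, T . \tag{2.2.26} \]

The latter allows us to fix $h_{00}$,

\[ h_{00} = h_{ii} + \frac{1}{d m^2}\, T . \tag{2.2.27} \]

The $i$ component of (2.2.25), along with the $0i$ component of (2.2.24), fix $h_{0i}$

\[ \left( \Delta - m^2 \right) h_{0i} = \partial_j \dot{h}_{ij} - T_{0i} , \qquad \dot{h}_{0i} = \partial_j h_{ij} + \frac{1}{d m^2} \partial_i T . \tag{2.2.28} \]

[...] (2.2.24), we get

[...] (2.2.29)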

We finally split $h_{ij}$ into its trace $h_{ii}$ and traceless part $\tilde{h}_{ij}$, and isolate $h_{00}$, $h_{0i}$, $h_{ii}$ in the above
equations


( d - 1 2 

—-— A — rn 

V d 

(A - to 2 ) hoi 
(A - to 2 ) h 0 i 

- A - to 2 ^ ha 
(A - TO 2 ) ha 




didjkj-Too + ^T-^-^AT, (2.2.30) 

djhij - T 0i + ((d — 1) A — dm 2 ) 1 <9* (djd k h jk - djT 0j ) ,(2.2.31) 
(A - to 2 ) + ~^~2 diT ^j + \ di (djdkhjk - T 00 ) , (2.2.32) 

didjhij — Too i (2.2.33) 

didjhij - diToi + ((d - 1) A - dm 2 ) 1 A (djd k h jk - djT 0j ) , 

(2.2.34) 


so the corresponding initial conditions are determined by the ones of the traceless part $\tilde{h}_{ij}$ and
$T_{\mu\nu}$, and these fields are non-dynamical. We are thus left with only $\tilde{h}_{ij}$ being unconstrained,
obeying a massive Klein-Gordon equation (the spatial traceless part of (2.2.24))

\[ \left( \Box - m^2 \right) \tilde{h}_{ij} = -\tilde{T}_{ij} - \frac{1}{d m^2} \left( \partial_i \partial_j - \frac{1}{d} \delta_{ij} \Delta \right) T , \tag{2.2.35} \]

and transforming in the tensor representation of SO(d). We thus have $N_{\rm f} = 2 N_{\rm d} = D^2 - D - 2$,
which in $D = 4$ gives $N_{\rm d} = 5$.
Massive α ≠ 0

Let us now move on to the $\alpha \neq 0$ case. As we have seen already in the $\alpha = 0$ case, the equations
for the components can easily become lengthy in the process of spotting the non-dynamical
fields and their precise form is not particularly illuminating. This is even worse here because
of the undetermined $\alpha$ parameter. We therefore propose to simply sketch the procedure for
generic $\alpha$ and then give the precise equations for the case $\alpha = -1/2$ which is considerably
simpler. In the subsequent sections where the method of analysis will be more suited, we will
treat the generic case explicitly to see that it is not qualitatively different from $\alpha = -1/2$.

So let us sketch the procedure for the generic case. In the $\alpha = 0$ case, the trace equation
eliminated $h_{00}$, so we were able to use the divergence equation to eliminate $h_{0i}$ and $h_{ii}$. Here,
since the trace is dynamical, we have that either $h_{00}$, or $h_{ii}$, will remain dynamical. More
precisely, using the 0 component of (2.2.17) and the appropriate combination of the 00 component
of (2.2.16) and (2.2.19), we find non-dynamical equations for $h_{00}$ and $h_{0i}$ which fix the initial
conditions in terms of the ones of $h_{ii}$ and $h_{0i}$. We can then use the $i$ component of (2.2.17)
along with the $0i$ component of (2.2.16) to do the same for $h_{0i}$. We are then left with the
$ij$ component of (2.2.16) in an appropriate combination with the 00 component and (2.2.19),
which yield dynamical equations for the unconstrained fields $h_{ij}$. This therefore corresponds
to $N_{\rm f} = 2 N_{\rm d} = D^2 - D$, or $N_{\rm d} = 6$ when $D = 4$, i.e. the trace $h_{ii}$ (or equivalently the Lorentz
trace $h = h_{ii} - h_{00}$) is now part of the dynamical spectrum.

In particular, for $\alpha = -1/2$, equations (2.2.16), (2.2.17) and (2.2.19) can be brought to the
simple form

\[ \left( \Box - m^2 \right) \bar{h}_{\mu\nu} = -T_{\mu\nu} , \qquad \partial^\mu \bar{h}_{\mu\nu} = 0 , \tag{2.2.36} \]

where

\[ \bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} h . \tag{2.2.37} \]

Using the second equation along with the $0\mu$ component we get

\[ \left( \Delta - m^2 \right) \bar{h}_{0i} = \partial_j \dot{\bar{h}}_{ij} - T_{0i} , \qquad \dot{\bar{h}}_{0i} = \partial_j \bar{h}_{ij} , \tag{2.2.38} \]

and

\[ \left( \Delta - m^2 \right) \bar{h}_{00} = \partial_i \partial_j \bar{h}_{ij} - T_{00} . \tag{2.2.39} \]

These fields are thus non-dynamical and their initial conditions are fixed in terms of the ones
of $\bar{h}_{ij}$ and $T_{\mu\nu}$. We are thus left with $\bar{h}_{ij}$ obeying

\[ \left( \Box - m^2 \right) \bar{h}_{ij} = -T_{ij} , \tag{2.2.40} \]

so $N_{\rm f} = 2 N_{\rm d} = D^2 - D$, and in particular $N_{\rm d} = 6$ for $D = 4$.

Massless

Let us now pass to the $m = 0$ case. First remember that (2.2.17) is a possible choice of
gauge only if $\alpha \neq 0$, since otherwise its divergence is gauge-invariant. We therefore have that
the $m = 0$ case follows from the massive $\alpha \neq 0$ case by simply setting $m \to 0$, although
now (2.2.17) is obtained by a gauge transformation, as in the spin-1 theory. We work in the
gauge corresponding to $\alpha = -1/2$ so that our equations are (2.2.36) with $m = 0$. Again, as
in electrodynamics, there is a residual gauge symmetry given by the gauge parameters that
satisfy $\Box \xi_\mu = 0$. The divergence and trace of the spatial part of (2.2.36) read

\[ \Box\, \partial_j \bar{h}_{ij} = -\partial_j T_{ij} , \qquad \Box\, \bar{h}_{ii} = -T_{ii} , \tag{2.2.41} \]

and their solutions take the form

\[ \partial_j \bar{h}_{ij} = \phi_i^{\rm hom} - \Box^{-1} \partial_j T_{ij} , \qquad \bar{h}_{ii} = \phi^{\rm hom} - \Box^{-1} T_{ii} , \tag{2.2.42} \]

where $\Box \phi_i^{\rm hom} = 0$ and $\Box \phi^{\rm hom} = 0$ are homogeneous solutions. We can then perform a residual
gauge transformation

\[ \partial_j \bar{h}_{ij} - \Delta \xi_i - \partial_i \dot{\xi}_0 = \phi_i^{\rm hom} - \Box^{-1} \partial_j T_{ij} , \qquad \bar{h}_{ii} + (d-2)\, \partial_i \xi_i - d\, \dot{\xi}_0 = \phi^{\rm hom} - \Box^{-1} T_{ii} , \tag{2.2.43} \]

and we see that we can kill the homogeneous solutions with the choice

\[ \dot{\xi}_0 = -\frac{1}{2(d-1)} \left[ (d-2)\, \Delta^{-1} \partial_i \phi_i^{\rm hom} + \phi^{\rm hom} \right] , \tag{2.2.44} \]

\[ \xi_i = -\Delta^{-1} \phi_i^{\rm hom} + \frac{1}{2(d-1)}\, \Delta^{-1} \partial_i \left[ (d-2)\, \Delta^{-1} \partial_j \phi_j^{\rm hom} + \phi^{\rm hom} \right] . \tag{2.2.45} \]

Therefore, $\partial_j \bar{h}_{ij}$ and $\bar{h}_{ii}$ are fully determined by the source and thus carry no degrees of freedom.
The only unconstrained components are the spatial transverse-traceless part $h_{ij}^{\rm tt}$, i.e. $\partial_j h_{ij}^{\rm tt} = 0$
and $h_{ii}^{\rm tt} = 0$, whose equation of motion is the spatial transverse-traceless part of (2.2.36)

\[ \Box\, h_{ij}^{\rm tt} = -T_{ij}^{\rm tt} , \tag{2.2.46} \]

and correspond to $N_{\rm f} = d^2 - d - 2$ degrees of freedom. As in the spin-1 case, the fact that $h_{ii}$
and $\partial_i h_{ij}$ are apparently dynamical is a gauge artefact. By starting all over again but rather
considering the gauge

\[ \partial_i h_{ij} = 0 , \tag{2.2.47} \]

we find indeed that they both obey non-dynamical equations and thus have that $N_{\rm f} = 2 N_{\rm d}$,
with $N_{\rm d} = 2$ in the $D = 4$ case.
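
The residual gauge variations entering (2.2.43) follow from the linear gauge symmetry (2.2.15) and the trace-reversed field of (2.2.37). The sketch below assumes a mostly-plus signature, so that $\partial^\mu \xi_\mu = -\dot\xi_0 + \partial_k \xi_k$, and takes $\bar h_{\mu\nu} = h_{\mu\nu} - \frac12 \eta_{\mu\nu} h$ as the definition of the barred field.

```latex
% Assumptions: mostly-plus signature; \bar h_{\mu\nu} = h_{\mu\nu} - \tfrac12 \eta_{\mu\nu} h.
\begin{align}
\delta h_{\mu\nu} &= -\partial_\mu\xi_\nu - \partial_\nu\xi_\mu , \qquad
\delta h = -2\,\partial^\mu\xi_\mu = 2\,\dot\xi_0 - 2\,\partial_k\xi_k , \\
\delta\bar h_{ij} &= \delta h_{ij} - \tfrac12\,\delta_{ij}\,\delta h
   = -\partial_i\xi_j - \partial_j\xi_i + \delta_{ij}\left(\partial_k\xi_k - \dot\xi_0\right) , \\
% spatial divergence: the two \partial_i\partial_k\xi_k terms cancel
\partial_j\,\delta\bar h_{ij} &= -\Delta\xi_i - \partial_i\dot\xi_0 , \\
% spatial trace: -2 + d = d - 2 copies of \partial_i\xi_i
\delta\bar h_{ii} &= (d-2)\,\partial_i\xi_i - d\,\dot\xi_0 .
\end{align}
```

Solving these two relations for $\xi_i$ and $\dot\xi_0$ so that they cancel $\phi_i^{\rm hom}$ and $\phi^{\rm hom}$ gives the choice quoted in (2.2.44) and (2.2.45).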


2.3 Canonical formalism 

The most rigorous way to perform the degree of freedom count and to study the stability of
a theory is through the canonical formalism (see for instance [31,32,39,44,45] for the case of
massive gravity). It is also the most suited way to unambiguously see that $N_{\rm f} = 2 N_{\rm d}$ for gauge
theories. Here we assume that the reader has the basic knowledge of constrained Hamiltonian
systems, i.e. Dirac's algorithm, weak equality⁷, first/second class constraint terminology⁸ etc.⁹

7 Weak equality "≈" holds for "= up to the addition of terms that are zero on the constrained hypersurface".

8 A constraint is "first class" if its Poisson bracket with any other constraint and the Hamiltonian is weakly
zero. A constraint that is not first class is called "second class".

9 See for instance [81] for details on this formalism.
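2.3.1 Spin 1

Massive

Since $A_0$ has no kinetic term $\sim \dot{A}_0^2$ in (2.2.1)

\[ S = \int d^D x \left[ \frac{1}{2} \dot{A}_i^2 - \frac{1}{4} F_{ij} F_{ij} - \frac{1}{2} m^2 A_i^2 + A_i j_i - \dot{A}_i \partial_i A_0 + \frac{1}{2} \left( \partial_i A_0 \right)^2 + \frac{1}{2} m^2 A_0^2 - A_0 j_0 \right] , \tag{2.3.1} \]

we Legendre transform only with respect to $A_i$. The conjugate momenta (the electric field)
read

\[ E_i \equiv \frac{\partial L}{\partial \dot{A}_i} = \dot{A}_i - \partial_i A_0 , \tag{2.3.2} \]

so that the action in canonical form is

\[ S = \int d^D x \left[ E_i \dot{A}_i - \mathcal{H}[E_i, A_i, A_0] \right] , \tag{2.3.3} \]

where

\[ \mathcal{H}[E_i, A_i, A_0] = \frac{1}{2} E_i^2 + \frac{1}{4} F_{ij} F_{ij} + \frac{1}{2} m^2 A_i^2 - A_i j_i - A_0 \left( \partial_i E_i - j_0 \right) - \frac{1}{2} m^2 A_0^2 , \tag{2.3.4} \]

is the Hamiltonian density. Since $A_0$ is an auxiliary non-dynamical field, it can be integrated
out in order to provide a clearer picture of the dynamics, i.e. it can be replaced with the
solution of its own equation of motion

\[ \mathcal{H}[E_i, A_i] = \frac{1}{2} E_i^2 + \frac{1}{4} F_{ij} F_{ij} + \frac{1}{2} m^2 A_i^2 + \frac{1}{2 m^2} \left( \partial_i E_i \right)^2 - A_i j_i - \frac{1}{m^2}\, \partial_i E_i\, j_0 + \mathcal{O}(j^2) . \tag{2.3.5} \]

It is then clear that we have $N_{\rm f} = 2 N_{\rm d} = 2d$ degrees of freedom forming two vectors $A_i$ and $E_i$
under SO(d).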
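
The elimination of $A_0$ is a one-line Gaussian step, written out here as a sketch. It uses only the $A_0$-dependent terms of the Hamiltonian density (2.3.4); nothing beyond that equation is assumed.

```latex
\begin{align}
% equation of motion of the auxiliary field A_0
\frac{\partial\mathcal H}{\partial A_0} = -\left(\partial_i E_i - j_0\right) - m^2 A_0 = 0
  &\quad\Rightarrow\quad
  A_0 = -\frac{1}{m^2}\left(\partial_i E_i - j_0\right) , \\
% inserting the solution back into the A_0-dependent part of \mathcal H
-A_0\left(\partial_i E_i - j_0\right) - \frac12\,m^2 A_0^2
  &= \frac{1}{2m^2}\left(\partial_i E_i - j_0\right)^2 \\
  &= \frac{1}{2m^2}\left(\partial_i E_i\right)^2 - \frac{1}{m^2}\,\partial_i E_i\,j_0 + \mathcal O(j^2) .
\end{align}
```

The source-free part of the result is a sum of squares, which is the positivity referred to at the end of this subsection.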


Massless

In the case $m = 0$, we have to go back to (2.3.4) and observe that $A_0$ becomes a Lagrange
multiplier enforcing the Gauss constraint

\[ \mathcal{G} \equiv \partial_i E_i - j_0 = 0 . \tag{2.3.6} \]

We now enter Dirac's constraint formalism so let us define the Poisson bracket

\[ \{ O, O' \} \equiv \int d^d x \left[ \frac{\delta O}{\delta A_i} \frac{\delta O'}{\delta E_i} - \frac{\delta O'}{\delta A_i} \frac{\delta O}{\delta E_i} \right] , \tag{2.3.7} \]

and let us also smear the phase space functions of interest

\[ A[f] \equiv \int d^d x\, f_i A_i , \qquad E[g] \equiv \int d^d x\, g_i E_i , \qquad G[A_0] \equiv \int d^d x\, A_0\, \mathcal{G} , \qquad H \equiv \int d^d x\, \mathcal{H} , \tag{2.3.8} \]

so that time-evolution is given by¹⁰

\[ \dot{O} = -\{ H, O \} + \partial_t O . \tag{2.3.9} \]

10 Note that the second term here is needed because $O$ can depend on the source which has its own
time-dependence. The $\partial_t$ operator will of course not act on the smearing fields $f_i$, $g_i$ and $A_0$.

We then have that, for a conserved source, $\mathcal{G}$ is first class

\[ \dot{G}[A_0] = 0 , \qquad \{ G[A_0], G[A_0'] \} = 0 , \tag{2.3.10} \]

so $A_0$ is not determined by the equations of motion and $G[A_0]$ generates abelian gauge
transformations on phase space

\[ \delta A[f] \equiv -\{ G[A_0], A[f] \} = -\int d^d x\, f_i\, \partial_i A_0 , \qquad \delta E[g] \equiv -\{ G[A_0], E[g] \} = 0 , \tag{2.3.11} \]

which for $A_i$ and $E_i$ translate into

\[ \delta A_i = -\partial_i A_0 , \qquad \delta E_i = 0 . \tag{2.3.12} \]

This implies that $\partial_i A_i$ is pure-gauge, the simplest example being the Coulomb gauge $\partial_i A_i = 0$.
Along with $\mathcal{G} = 0$, we thus get that the longitudinal parts of $A_i$ and $E_i$ are non-dynamical,
leaving only the transverse parts as the $N_{\rm f} = 2(d-1)$ degrees of freedom of the theory.
Moreover, here we can clearly see why $N_{\rm f} = 2 N_{\rm d}$. Indeed, the fields with constrained initial
conditions $A_0$ and $\partial_i A_i$ appear as a Lagrange multiplier $A_0$, which is thus totally arbitrary and
in fact represents the gauge parameter, and a canonical couple subject to a constraint
(spatial differential equation) and a gauge transformation on phase space. Thus both $A_0$ and
$\partial_i A_i$ are non-dynamical and thus $N_{\rm f} = 2 N_{\rm d}$. Finally, note that in both the massive and massless
cases, the quadratic part of the (gauge-fixed for $m = 0$) Hamiltonian is positive definite, so
these theories are stable.


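2.3.2 Spin 2

Massive α ≠ 0

Since the $h_{00}$ and $h_{0i}$ components have no kinetic term in (2.2.13), we first remove all
time-derivatives that act upon them by integrating by parts

\[
\begin{aligned}
S = \int d^D x \Big[ & \frac{1}{2} \left( \dot{h}_{ij}^2 - \dot{h}_{ii}^2 \right) - \frac{1}{2} \left( \partial_i h_{jk} \right)^2 + \left( \partial_i h_{ij} \right)^2 - \partial_i h_{ij} \partial_j h_{kk} + \frac{1}{2} \left( \partial_i h_{jj} \right)^2 \\
& - 2 \dot{h}_{ij} \partial_i h_{j0} + 2 \dot{h}_{ii} \partial_j h_{j0} + \partial_i h_{ij} \partial_j h_{00} - \partial_i h_{jj} \partial_i h_{00} + 2\, \partial_{[i} h_{j]0}\, \partial_i h_{j0} \\
& - \frac{1}{2} m^2 \left( h_{ij}^2 - (1+\alpha) h_{ii}^2 + 2 (1+\alpha) h_{00} h_{ii} - 2 h_{0i}^2 - \alpha h_{00}^2 \right) \\
& + h_{00} T_{00} - 2 h_{0i} T_{0i} + h_{ij} T_{ij} \Big] ,
\end{aligned}
\tag{2.3.13}
\]

and then Legendre transform only with respect to $h_{ij}$. The conjugate momenta read

\[ \pi_{ij} \equiv \frac{\partial L}{\partial \dot{h}_{ij}} = \dot{h}_{ij} - \delta_{ij} \dot{h}_{kk} - 2\, \partial_{(i} h_{j)0} + 2\, \delta_{ij}\, \partial_k h_{k0} , \tag{2.3.14} \]

and the inversion gives

\[ \dot{h}_{ij} = \pi_{ij} - \frac{1}{d-1}\, \delta_{ij}\, \pi_{kk} + 2\, \partial_{(i} h_{j)0} , \tag{2.3.15} \]

so that the action in canonical form reads

\[ S = \int d^D x \left[ \pi_{ij} \dot{h}_{ij} - \mathcal{H}[h_{ij}, \pi_{ij}, h_{00}, h_{0i}] \right] , \tag{2.3.16} \]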
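and the Hamiltonian density is

\[
\begin{aligned}
\mathcal{H}[h_{ij}, \pi_{ij}, h_{00}, h_{0i}] = {} & \frac{1}{2} \left( \pi_{ij}^2 - \frac{1}{d-1} \pi_{kk}^2 \right) + \frac{1}{2} \left( \partial_i h_{jk} \right)^2 - \left( \partial_i h_{ij} \right)^2 + \partial_i h_{ij} \partial_j h_{kk} - \frac{1}{2} \left( \partial_i h_{jj} \right)^2 \\
& + \frac{1}{2} m^2 \left( h_{ij}^2 - (1+\alpha) h_{ii}^2 - 2 h_{0i}^2 - \alpha h_{00}^2 \right) - h_{ij} T_{ij} \\
& + h_{00} \left( \partial_i \partial_j h_{ij} - \Delta h_{ii} + (1+\alpha) m^2 h_{ii} - T_{00} \right) - 2 h_{0i} \left( \partial_j \pi_{ij} + T_{0i} \right) .
\end{aligned}
\tag{2.3.17}
\]

We see that $h_{0i}$ is an auxiliary field that appears quadratically whatever the value of $\alpha$, so we
can integrate it out as we did for $A_0$ in the spin-1 case to get

\[
\begin{aligned}
\mathcal{H}[h_{ij}, \pi_{ij}, h_{00}] = {} & \frac{1}{2} \left( \pi_{ij}^2 - \frac{1}{d-1} \pi_{kk}^2 \right) + \frac{1}{m^2} \left( \partial_i \pi_{ij} \right)^2 + \frac{1}{2} \left( \partial_i h_{jk} \right)^2 - \left( \partial_i h_{ij} \right)^2 + \partial_i h_{ij} \partial_j h_{kk} \\
& - \frac{1}{2} \left( \partial_i h_{jj} \right)^2 + \frac{1}{2} m^2 \left( h_{ij}^2 - (1+\alpha) h_{ii}^2 - \alpha h_{00}^2 \right) \\
& + h_{00} \left( \partial_i \partial_j h_{ij} - \Delta h_{ii} + (1+\alpha) m^2 h_{ii} - T_{00} \right) - h_{ij} T_{ij} + \frac{2}{m^2}\, \partial_i \pi_{ij} T_{0j} + \mathcal{O}(T^2) .
\end{aligned}
\tag{2.3.18}
\]

Now, for $\alpha \neq 0$ we have that $h_{00}$ is also a quadratic auxiliary field, so we can integrate it out
as well

\[
\begin{aligned}
\mathcal{H}[h_{ij}, \pi_{ij}] = {} & \frac{1}{2} \left( \pi_{ij}^2 - \frac{1}{d-1} \pi_{kk}^2 \right) + \frac{1}{m^2} \left( \partial_i \pi_{ij} \right)^2 + \frac{1}{2} \left( \partial_i h_{jk} \right)^2 - \left( \partial_i h_{ij} \right)^2 + \partial_i h_{ij} \partial_j h_{kk} - \frac{1}{2} \left( \partial_i h_{jj} \right)^2 \\
& + \frac{1}{2 m^2 \alpha} \left[ \partial_i \partial_j h_{ij} - \Delta h_{ii} + (1+\alpha) m^2 h_{ii} \right]^2 + \frac{1}{2} m^2 \left( h_{ij}^2 - (1+\alpha) h_{ii}^2 \right) \\
& - h_{ij} T_{ij} - \frac{1}{m^2 \alpha} \left( \partial_i \partial_j h_{ij} - \Delta h_{ii} + (1+\alpha) m^2 h_{ii} \right) T_{00} + \frac{2}{m^2}\, \partial_i \pi_{ij} T_{0j} + \mathcal{O}(T^2) .
\end{aligned}
\tag{2.3.19}
\]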
To see the instability in this setting we can harmonically decompose $\pi_{ij}$

\[ \pi_{ij} = \frac{1}{d} \delta_{ij} \pi + \left( \partial_i \partial_j - \frac{1}{d} \delta_{ij} \Delta \right) l + \partial_{(i} v_{j)} + t_{ij} , \qquad \partial_i v_i = t_{ii} = 0 , \qquad \partial_i t_{ij} = 0 , \tag{2.3.20} \]

and trade $l$ for the more convenient variable

\[ \Pi \equiv \pi + (d-1)\, \Delta l , \tag{2.3.21} \]

to get that the part of $\mathcal{H}$ which is quadratic in $\pi_{ij}$ in the scalar sector reads

\[ \mathcal{H}_{(\pi\, {\rm scal.})^2} = \frac{1}{2 d (d-1)}\, \Pi^2 + \frac{1}{d^2 m^2} \left( \partial_i \Pi \right)^2 - \frac{1}{d (d-1)}\, \pi\, \Pi . \tag{2.3.22} \]

We see that $\mathcal{H}$ is thus not positive-definite, or that by completing the square there is a
negative-definite term. Since this occurs at the level of the conjugate momenta, we have that the
corresponding mode is a ghost. The degrees of freedom are the $h_{ij}$ and $\pi_{ij}$ components, that
is, a total of $N_{\rm f} = 2 N_{\rm d} = d^2 + d = D^2 - D$.
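
The scalar-sector expression can be checked by inserting the scalar part of the harmonic decomposition into the momentum-dependent terms of the Hamiltonian density. This is a sketch restricted to the scalar variables $\pi$ and $l$; the symbol $\simeq$ (equality up to spatial total derivatives) is introduced here for convenience.

```latex
% Scalar sector only:
%   \pi_{ij} = \tfrac1d \delta_{ij}\pi + (\partial_i\partial_j - \tfrac1d\delta_{ij}\Delta)\, l ,
%   \Pi \equiv \pi + (d-1)\Delta l .
% \simeq : equality up to spatial total derivatives.
\begin{align}
\pi_{kk} &= \pi , \qquad
\pi_{ij}\pi_{ij} \simeq \frac1d\,\pi^2 + \frac{d-1}{d}\left(\Delta l\right)^2 , \qquad
\partial_i\pi_{ij} = \frac1d\,\partial_j\Pi , \\
\frac12\left(\pi_{ij}^2 - \frac{1}{d-1}\,\pi_{kk}^2\right)
  &\simeq \frac{1}{2d(d-1)}\left[\left(\Pi-\pi\right)^2 - \pi^2\right]
  = \frac{1}{2d(d-1)}\,\Pi^2 - \frac{1}{d(d-1)}\,\pi\,\Pi , \\
\frac{1}{m^2}\left(\partial_i\pi_{ij}\right)^2
  &= \frac{1}{d^2m^2}\left(\partial_i\Pi\right)^2 .
\end{align}
```

The intermediate form $(\Pi - \pi)^2 - \pi^2$ makes the completed square explicit: the $-\pi^2$ piece is the negative-definite term responsible for the ghost.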
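Massive α = 0

So let us go back to (2.3.18) and move on to the $\alpha = 0$ case where $h_{00}$ becomes a Lagrange
multiplier enforcing the constraint

\[ \mathcal{C}_t \equiv \partial_i \partial_j h_{ij} - \left( \Delta - m^2 \right) h_{ii} - T_{00} = 0 . \tag{2.3.23} \]

Defining the Poisson bracket

\[ \{ O, O' \} \equiv \int d^d x \left[ \frac{\delta O}{\delta h_{ij}} \frac{\delta O'}{\delta \pi_{ij}} - \frac{\delta O'}{\delta h_{ij}} \frac{\delta O}{\delta \pi_{ij}} \right] , \tag{2.3.24} \]

and the smeared observables

\[ h[f] \equiv \int d^d x\, f_{ij} h_{ij} , \qquad \pi[g] \equiv \int d^d x\, g_{ij} \pi_{ij} , \qquad C_t[h_{00}] \equiv \int d^d x\, h_{00}\, \mathcal{C}_t , \qquad H \equiv \int d^d x\, \mathcal{H} , \tag{2.3.25} \]

we get that $\mathcal{C}_t$ is second class (using (2.3.9) and for a conserved source)

\[ \dot{C}_t[h_{00}] = -\int d^d x\, h_{00}\, \mathcal{C}' \equiv -C'[h_{00}] , \qquad \mathcal{C}' \equiv \partial_i \partial_j \pi_{ij} + \frac{1}{d-1}\, m^2 \pi_{ii} + \partial_i T_{0i} , \tag{2.3.26} \]

so it is a priori not conserved under time evolution. To repair this, we must therefore consider
$\mathcal{C}'$ as an additional (secondary) constraint and append a term $q\, \mathcal{C}'$ to the total Hamiltonian
density $\mathcal{H}$

\[ \mathcal{H} \to \mathcal{H} + q\, \mathcal{C}' , \tag{2.3.27} \]

with $q$ a Lagrange multiplier. Now, demanding that $\mathcal{C}'$ be conserved fixes $h_{00}$

\[ \dot{C}'[q] \sim \mathcal{C}_t + d m^2 \left( h_{00} - h_{ii} \right) - T \approx 0 , \tag{2.3.28} \]

so we can choose

\[ h_{00} = h_{ii} + \frac{1}{d m^2}\, T , \tag{2.3.29} \]

which is nothing but (2.2.20), and the Hamiltonian density now reads

\[
\begin{aligned}
\mathcal{H}[h_{ij}, \pi_{ij}, q] = {} & \frac{1}{2} \left( \pi_{ij}^2 - \frac{1}{d-1} \pi_{kk}^2 \right) + \frac{1}{m^2} \left( \partial_i \pi_{ij} \right)^2 \\
& + \frac{1}{2} \left( \partial_i h_{jk} \right)^2 - \left( \partial_i h_{ij} \right)^2 + \frac{1}{2} \left( \partial_i h_{jj} \right)^2 + \frac{1}{2} m^2 \left( h_{ij}^2 + h_{ii}^2 \right) \\
& + q\, \mathcal{C}' - h_{ij} T_{ij} - h_{jj} T_{00} + \frac{1}{d m^2}\, T \left( \partial_i \partial_j h_{ij} - \left( \Delta - m^2 \right) h_{ii} \right) + \frac{2}{m^2}\, \partial_i \pi_{ij} T_{0j} + \mathcal{O}(T^2) .
\end{aligned}
\tag{2.3.30}
\]

We must finally demand that $\mathcal{C}_t$ be conserved under time-evolution with this new Hamiltonian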

Ct~C'- (2.3.31) 


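which ends up fixing $q = 0$. The resulting Hamiltonian preserves the constraints $\mathcal{C}_t$, $\mathcal{C}'$ under
time-evolution as long as they are imposed on the initial conditions. These constraints kill
precisely the degree of freedom which causes the instability in the $\alpha \neq 0$ case and make the
Hamiltonian positive-definite. Indeed, by performing a harmonic decomposition of $\pi_{ij}$ (2.3.20)
and also $h_{ij}$

\[ h_{ij} = \frac{1}{d} \delta_{ij} \phi + \left( \partial_i \partial_j - \frac{1}{d} \delta_{ij} \Delta \right) \lambda + \partial_{(i} \beta_{j)} + \tau_{ij} , \qquad \partial_i \beta_i = \tau_{ii} = 0 , \qquad \partial_i \tau_{ij} = 0 , \tag{2.3.32} \]

we can actually solve the constraints explicitly and get that they relate the traces to the
longitudinal parts

\[ \left( \frac{d-1}{d} \Delta - m^2 \right) \phi = \frac{d-1}{d}\, \Delta^2 \lambda - T_{00} , \tag{2.3.33} \]

\[ \left( \frac{d-1}{d} \Delta + m^2 \right) \pi = -(d-1) \left( \frac{d-1}{d}\, \Delta^2 l + \partial_i T_{i0} \right) . \tag{2.3.34} \]

Actually, as in the $\alpha \neq 0$ case, one can use more convenient combinations instead of the
longitudinal modes

\[ \Phi \equiv \Delta \lambda - \phi , \qquad \Pi \equiv \pi + (d-1)\, \Delta l , \tag{2.3.35} \]

so that, using the constraints, we can express $\phi$, $\lambda$, $\pi$, $l$ in terms of $\Phi$ and $\Pi$. We get

\[ \phi = -\frac{1}{m^2} \left[ \frac{d-1}{d}\, \Delta \Phi - T_{00} \right] , \qquad \Delta \lambda = -\frac{1}{m^2} \left[ \frac{d-1}{d}\, \Delta \Phi - m^2 \Phi - T_{00} \right] , \tag{2.3.36} \]

and

\[ \pi = -\frac{d-1}{m^2} \left[ \frac{1}{d}\, \Delta \Pi + \partial_i T_{i0} \right] , \qquad \Delta l = \frac{1}{m^2} \left[ \frac{1}{d}\, \Delta \Pi + \frac{1}{d-1}\, m^2 \Pi + \partial_i T_{i0} \right] , \tag{2.3.37} \]

and the action reads

\[ S = \int d^D x \left[ \frac{1}{d}\, \Pi \dot{\Phi} + \frac{1}{2}\, \partial_i v_j\, \partial_i \dot{\beta}_j + t_{ij} \dot{\tau}_{ij} - \mathcal{H} \right] , \tag{2.3.38} \]

where

\[ \mathcal{H} = \mathcal{H}_{\rm scalar} + \mathcal{H}_{\rm vector} + \mathcal{H}_{\rm tensor} , \tag{2.3.39} \]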

with 


li 


^scalar — 

^vector = 
^tensor = 


1 9 d l/o . \ 1 _ d 1 / a \ 

n- + 4> (m — A) nAg + —— $ (p - Act) 


2d(d- 1) 
d — 1 


dm 2 


d 


+ 


dm 2 


TA (p + Act ) + 0(T 2 ^ 


1 


diVj (m 2 - A) SiUj + ^ r?r 2 (<9j/3,) 2 + UjAg, + ^ ftAffj + 0(T 2 ) 


4r?7 2 

J tij + \ Tij (m 2 - A) Tij - Tijffij + 0(T 2 ). 


(2.3.40) 


The quadratic part is positive definite and the theory is thus stable, with $N_{\rm f} = 2 N_{\rm d} = d^2 +
d - 2 = D^2 - D - 2$ degrees of freedom. These correspond to $\Phi$, $\beta_i$, $\tau_{ij}$ and their conjugate
momenta.

11 The harmonic variables of the source $\rho$, $p$, $q$, $\sigma$, $q_i$, $\sigma_i$, $\sigma_{ij}$ are defined in (2.4.18) and the conservation equation
in terms of them reads (2.4.36).
Massless

We can finally proceed to the massless case where $h_{0i}$ becomes a Lagrange multiplier as well.
We must therefore go back to (2.3.17) with $m = 0$ and define the constraint imposed by $h_{0i}$ as

\[ C_s[h_{0i}] \equiv \int d^d x\, h_{0i}\, \mathcal{C}_i , \qquad \mathcal{C}_i \equiv \partial_j \pi_{ij} + T_{0i} . \tag{2.3.41} \]

Observe that the secondary constraint $\mathcal{C}'$ of Fierz-Pauli theory (2.3.26) actually reduces to $\partial_i \mathcal{C}_i$
in the $m \to 0$ limit. The only non-trivial Poisson bracket arises in

\[ \dot{C}_t[h_{00}] = -C_s[\partial_i h_{00}] \approx 0 , \tag{2.3.42} \]

so the system is now first class and the $h_{0\mu}$ are not determined by the equations of motion.
Rather, they serve as the gauge parameters of the gauge transformations generated by $\mathcal{C}_t$ and
$\mathcal{C}_s$ on phase space

\[ \delta h[f] \equiv -\{ C_t[h_{00}], h[f] \} - \{ C_s[h_{0i}], h[f] \} = -\int d^d x\, f_{ij}\, \partial_i h_{j0} , \tag{2.3.43} \]

\[ \delta \pi[g] \equiv -\{ C_t[h_{00}], \pi[g] \} - \{ C_s[h_{0i}], \pi[g] \} = \int d^d x\, g_{ij} \left( \partial_i \partial_j - \delta_{ij} \Delta \right) h_{00} , \tag{2.3.44} \]

which for $h_{ij}$ and $\pi_{ij}$ imply

\[ \delta h_{ij} = -\partial_{(i} h_{j)0} , \qquad \delta \pi_{ij} = \left( \partial_i \partial_j - \delta_{ij} \Delta \right) h_{00} . \tag{2.3.45} \]

These can be used to fix the gauge to

\[ \partial_i h_{ij} = 0 , \qquad \pi_{ii} = 0 , \tag{2.3.46} \]

which, along with $\mathcal{C}_t = 0$ and $\mathcal{C}_i = 0$, imply that $h_{ij}$ and $\pi_{ij}$ are both transverse-traceless. The
degrees of freedom of the theory are thus $N_{\rm f} = 2 N_{\rm d} = d^2 - d - 2$. The Hamiltonian density in
this gauge is positive-definite

\[ \mathcal{H} = \frac{1}{2} \pi_{ij}^2 + \frac{1}{2} \left( \partial_i h_{jk} \right)^2 + \mathcal{O}(T) , \tag{2.3.47} \]

so the theory is stable.
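
The degree-of-freedom counts found in this section can be cross-checked against the standard counting rule for constrained Hamiltonian systems, in which each first-class constraint removes two phase-space variables and each second-class constraint removes one. The rule itself is textbook material rather than something derived in this chapter, and the tally below is only a sketch of how it applies to the cases treated here.

```latex
% Standard rule (per space point):
%   N_f = (number of canonical variables)
%         - 2 x (number of first-class constraints)
%         - (number of second-class constraints).
% Canonical variables: (A_i, E_i) -> 2d ;  (h_{ij}, \pi_{ij}) -> d(d+1).
\begin{align}
\text{massless spin 1 (one first-class constraint):}\quad
  & N_{\rm f} = 2d - 2 = 2(d-1) , \\
\text{massive spin 2, } \alpha = 0 \text{ (two second-class constraints):}\quad
  & N_{\rm f} = d(d+1) - 2 = d^2 + d - 2 , \\
\text{massless spin 2 } (1 + d \text{ first-class constraints):}\quad
  & N_{\rm f} = d(d+1) - 2(d+1) = d^2 - d - 2 .
\end{align}
```

In the massive spin-1 and massive $\alpha \neq 0$ spin-2 cases no constraints survive once the auxiliary fields are integrated out, so $N_{\rm f}$ is simply $2d$ and $d(d+1)$ respectively.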

We can now note that the combinations $\Phi$ and $\Pi$ defined in (2.3.35), that were used in
the treatment of the massive theory, are actually invariant under (2.3.45)¹². We thus have
that the scalar sector of the FP theory $\alpha = 0$, once the second class constraints are solved,
becomes invariant under the transformations generated by the massless constraints $\mathcal{C}_t$, $\mathcal{C}_s$. It
is important to note however that not all of these constraints appear in the massive theory
and, for those that do, they are second class for $m \neq 0$. This means they do not correspond
to gauge symmetries, since there is no totally undetermined field playing the role of the gauge
parameter. Thus, by discussing the transformations of the scalars in the massive theory we are
actually comparing objects in two different theories.

12 This will become clear in the next section where we will deduce the transformations of the harmonic variables
under the gauge symmetry.
Nevertheless, the fact that these modes are gauge-invariant is again a property of the scalar sector of FP theory alone, since this is not the case of the vector sector, where $V_i$ is gauge-invariant but $\beta_i$ is not, and it is also not the case in massive electrodynamics. Moreover, it is also not the case for the ghost scalar when $a \neq 0$, so this has all the characteristics of the issue that was discussed in the previous section: on the FP point $a = 0$ there is something that looks like a gauge symmetry but actually is not.

Another interesting feature we can already see here is the vDVZ discontinuity of the $a = 0$ theory. Indeed, sending $m \to 0$ in (2.3.40) effectively neutralizes the vector modes but the scalar mode remains, that is, one more dynamical field than in the $m = 0$ case.

It seems that the use of harmonic variables has helped our understanding of this apparent symmetry issue and has generally made the dynamics of the theory more transparent. Unfortunately, in the canonical formalism the action is a bit too crowded because of the presence of the conjugate momenta, so this is still not the optimal way to understand the theory. We therefore now propose to use harmonic variables, but in the Lagrangian formulation.

2.4 Harmonic formalism 

In this section we consider the d-harmonic decomposition, but at the level of the Lagrangian 
formulation. This will allow us to explore the above mentioned “residual gauge symmetry” of 
Fierz-Pauli theory, but it will also make the dynamical structure of the theory more transparent. 
Moreover, this formalism is also easily applicable in the case of a de-Sitter background. It will 
thus allow us to understand in a different language a number of results in the literature on the 
degrees of freedom of massive gravity over de-Sitter. This section is based on original work 
from our group [79]. 

To briefly introduce the harmonic decomposition, let us start by noting that at each space-time point $x$, the field components form irreducible representations of SO(d), i.e. $A_0(x)$ is a SO(d)-scalar, $A_i(x)$ a SO(d)-vector, the traceless part of $h_{ij}(x)$ is a SO(d)-tensor and so on. If we now consider the full group of isometries of d-dimensional space, i.e. the Euclidean group ISO(d) of rotations and translations, then a field is no longer seen as an infinite collection of independent SO(d) representations, but as a finite collection of irreducible representations of ISO(d). For instance, we have that inside of $A_i$ there hides a scalar under SO(d), that is, $\partial_i A_i$. We can therefore split $A_i$ into its scalar part $\partial_i A_i$ and its transverse vector part $A_i^{\rm T}$, obeying $\partial_i A_i^{\rm T} = 0$, which obviously do not mix under translations, nor under rotations since the latter commute with $\partial_i$. For tensors one can analogously decompose the traceless part of $h_{ij}$ into a scalar, a transverse vector and a transverse-traceless tensor.

We will therefore refer to “d-scalars”, “d-vectors” and “d-tensors” for these irreducible 
representations of ISO(d), while the irreducible representations of SO(d) will be referred to as 
“SO(d)-vectors” and “SO(d)-tensors”. Note that d-vectors and d-tensors are thus automatically 
transverse. The basic advantage of the harmonic decomposition in our analysis lies in the 
following fact: the massless dynamical fields form the highest possible irreducible representation 
of ISO(d), while the massive ones form the highest possible representation of SO(d). This 
formalism is thus ideal for observing the activation of modes by mass. 
2.4.1 Spin 1

We start by splitting $A_\mu$ and $j_\mu$ into longitudinal and transverse parts

$$ A_0 = \psi , \qquad A_i = \partial_i \lambda + \beta_i , \qquad \partial_i \beta_i = 0 , \tag{2.4.1} $$

$$ j_0 = -\rho , \qquad j_i = \partial_i \sigma + \sigma_i , \qquad \partial_i \sigma_i = 0 , \tag{2.4.2} $$

with the inverse map being

$$ \lambda = \Delta^{-1} \partial_i A_i , \qquad \beta_i = P_{ij} A_j , \tag{2.4.3} $$

and so on for $j_\mu$, where $P_{ij}$ is the projector on the subspace of d-vector fields (transverse SO(d)-vectors)

$$ P_{ij} \equiv \delta_{ij} - \partial_i \Delta^{-1} \partial_j , \qquad P_{ik} P_{kj} = P_{ij} , \qquad \partial_i P_{ij} = 0 . \tag{2.4.4} $$
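The projector properties in (2.4.4) can be checked mode by mode in Fourier space, where $\partial_i \to i k_i$ and $\Delta \to -k^2$, so that $P_{ij} \to \delta_{ij} - k_i k_j / k^2$. The following is only a minimal sketch; the function names are illustrative and not part of the text.

```python
def transverse_projector(k):
    """Fourier symbol of P_ij: delta_ij - k_i k_j / |k|^2 (undefined at k = 0)."""
    k2 = sum(c * c for c in k)
    if k2 == 0:
        raise ValueError("P_ij is not defined for the zero mode k = 0")
    n = len(k)
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(n)]
            for i in range(n)]


def split_vector(k, A):
    """Split a Fourier amplitude A_i into (longitudinal, transverse) parts."""
    P = transverse_projector(k)
    n = len(A)
    transverse = [sum(P[i][j] * A[j] for j in range(n)) for i in range(n)]
    longitudinal = [A[i] - transverse[i] for i in range(n)]
    return longitudinal, transverse


if __name__ == "__main__":
    k = [1.0, 2.0, 2.0]
    A = [3.0, 0.0, -1.0]
    longitudinal, transverse = split_vector(k, A)
    # The transverse part is orthogonal to k, i.e. d_i beta_i = 0.
    print(sum(ki * ti for ki, ti in zip(k, transverse)))
```

The division by $k^2$ is the Fourier-space counterpart of the spatial non-locality of the harmonic variables.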

Note that the harmonic variables are therefore spatially non-local combinations of the original fields. In terms of the harmonic variables the gauge transformation (2.2.2) reads

$$ \delta\psi = -\dot\theta , \qquad \delta\lambda = -\theta , \qquad \delta\beta_i = 0 , \tag{2.4.5} $$

so that $\beta_i$ is gauge-invariant, while $\lambda$ and $\psi$ can combine to form the gauge-invariant combination

$$ \Psi \equiv \psi - \dot\lambda . \tag{2.4.6} $$

On the other hand, current conservation $\partial_\mu j^\mu = 0$ translates into

$$ \dot\rho = -\Delta\sigma , \tag{2.4.7} $$

and this equation will be implicitly used every time some $\dot\rho$ term appears. We then get that the action (2.2.1) can be written as

$$ S = \int d^D x \left[ -\frac{1}{2} \partial_\mu \beta_i \partial^\mu \beta_i - \frac{1}{2} m^2 \beta_i \beta_i + \frac{1}{2} \partial_i \Psi \partial_i \Psi + \frac{1}{2} m^2 \left( \psi^2 - \partial_i \lambda \partial_i \lambda \right) + \rho \Psi + \beta_i \sigma_i \right] , \tag{2.4.8} $$

where we consider $\psi$, $\lambda$, $\beta_i$ as the independent fields, while $\Psi$ is just a shorthand notation for the combination $\psi - \dot\lambda$. Here one might be tempted to use $\Psi$ as an independent field instead of $\psi$ or $\lambda$, but one should refrain from doing so. This is because $\Psi$ depends on time-derivatives of the original fields $A_\mu$. This in turn implies that the initial conditions of $\Psi$ are not determined, since they require the knowledge of the initial value of $\ddot\lambda$. So keep in mind that one can only consider field redefinitions that preserve the initial data for the Cauchy problem to remain well-posed.


Massless 

This subtlety being mentioned, the first thing to notice in the above action is that for $m = 0$ it is explicitly gauge-invariant, since it depends only on the gauge-invariant quantities $\Psi$ and $\beta_i$. The latter obeys a massless Klein-Gordon equation

$$ \square \beta_i = -\sigma_i , \tag{2.4.9} $$

and thus constitutes the $2(d - 1)$ degrees of freedom of the theory. The equation of motion of $\psi$ is the Poisson equation

$$ \Delta \Psi = \rho , \tag{2.4.10} $$

while the equation of motion of $\lambda$ is the time-derivative of it. Pay attention to the way in which gauge-invariance neutralizes the longitudinal mode $\lambda$ in this setting. The latter does have a kinetic term $\dot\lambda^2$ in the action, which would naively make it dynamical, but the fact that it enters only through the combination $\Psi$ and that the latter ultimately obeys a purely spatial equation makes $\lambda$ non-dynamical. Therefore, in the massless case, it turns out that we can effectively consider $\Psi$ as an independent variable and vary the action with respect to it, because the initial conditions of $\lambda$ are pure-gauge so the initial data of $\Psi$ are defined. This will no longer be true in the massive theory.

Until now, the spatial differential equations we obtained always concerned gauge-dependent fields, so that we did not need to worry about questions of instantaneous response to a source. Here, we are witnessing an equation that involves only spatial derivatives for $\Psi$, which is a gauge-invariant variable. As anticipated in section 2.1.2, we see however that $\Psi$ is a spatially non-local functional of the original fields, so that it cannot be measured instantaneously to begin with.

Massive 

Turning on the mass $m \neq 0$, we first see that the gauge-invariant variables are not sufficient to describe the mass term since the latter breaks the gauge symmetry. This means that the equation of $\lambda$ will not be implied by the one of $\psi$ any more and therefore that its time-derivatives will now make it a dynamical field. The equation of motion of $\beta_i$ is now a massive Klein-Gordon equation

$$ \left( \square - m^2 \right) \beta_i = -\sigma_i , \tag{2.4.11} $$

while the ones of $\psi$ and $\lambda$ read

$$ \left( \Delta - m^2 \right) \psi = \rho + \Delta \dot\lambda , \qquad \left( \partial_t^2 + m^2 \right) \lambda = \sigma + \dot\psi . \tag{2.4.12} $$

Then, isolating $\dot\psi$ in the latter and plugging the result in the time-derivative of the equation of $\psi$, we get

$$ \left( \square - m^2 \right) \lambda = -\sigma , \tag{2.4.13} $$

so $\lambda$ corresponds to the additional 2 degrees of freedom one gets when $m \neq 0$. On the other hand, solving for $\ddot\lambda$ in its own equation of motion and plugging the result in the time-derivative of the equation of $\psi$, we get that $\psi$ is non-dynamical and that its initial conditions are totally determined by the ones of the other fields

$$ \left( \Delta - m^2 \right) \psi = \rho + \Delta \dot\lambda , \qquad \dot\psi = \Delta \lambda . \tag{2.4.14} $$
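For the reader who wishes to follow the elimination explicitly, the steps leading to (2.4.13) and (2.4.14) can be sketched as follows. Only the conservation equation (2.4.7) and the two equations in (2.4.12) are used; nothing else is assumed.

```latex
\begin{align*}
\dot\psi &= \ddot\lambda + m^2 \lambda - \sigma
  && \text{(second equation of (2.4.12))} \\
(\Delta - m^2)\,\dot\psi &= \dot\rho + \Delta\ddot\lambda = -\Delta\sigma + \Delta\ddot\lambda
  && \text{(time-derivative of the first, with (2.4.7))} \\
(\Delta - m^2)\left(\ddot\lambda + m^2\lambda - \sigma\right) &= \Delta\ddot\lambda - \Delta\sigma
  \;\;\Rightarrow\;\; m^2\left(-\partial_t^2 + \Delta - m^2\right)\lambda = -m^2\sigma \\
\Rightarrow\quad (\square - m^2)\,\lambda &= -\sigma , \\
\dot\psi &= \left[(\Delta - m^2)\lambda + \sigma\right] + m^2\lambda - \sigma = \Delta\lambda .
\end{align*}
```

The overall factor of $m^2$ in the third line is the reason this argument only goes through for $m \neq 0$.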

One could therefore integrate-out $\psi$, and redefine the longitudinal modes by a spatially non-local operation$^{13}$

$$ \hat\lambda \equiv \sqrt{\Delta \left( \Delta - m^2 \right)^{-1}} \, \lambda , \qquad \hat\sigma \equiv \sqrt{\Delta \left( \Delta - m^2 \right)^{-1}} \, \sigma , \tag{2.4.15} $$

13 Note that the square-root is real because $\Delta (\Delta - m^2)^{-1}$ is positive-definite, as can be seen by using its Fourier representation.

to find a spatially local action for the dynamical fields only

$$ S = \int d^D x \left[ -\frac{1}{2} \partial_\mu \beta_i \partial^\mu \beta_i - \frac{1}{2} m^2 \beta_i \beta_i + \beta_i \sigma_i + m^2 \left( -\frac{1}{2} \partial_\mu \hat\lambda \partial^\mu \hat\lambda - \frac{1}{2} m^2 \hat\lambda^2 + \hat\lambda \hat\sigma \right) + \mathcal{O}(j^2) \right] . \tag{2.4.16} $$

Now only the dynamical fields appear in the action. This was already the case in the canonical formalism after having integrated-out $A_0$, but the advantage here is that the action is analytic in $m^2$ so that one can study the $m \to 0$ limit unambiguously. Note that the new dynamical mode one gets in the massive case (here $\lambda$ or $\hat\lambda$) is not gauge-invariant, as one would expect in a massive theory, and disappears in the $m \to 0$ limit.
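The non-local redefinition (2.4.15) becomes a simple multiplication in Fourier space, where $\Delta \to -k^2$ and hence $\Delta(\Delta - m^2)^{-1} \to k^2/(k^2 + m^2)$. The following minimal sketch (the function name is illustrative, not from the text) makes the statement of footnote 13 concrete and shows that the rescaling tends to the identity as $m \to 0$.

```python
import math


def hat_symbol(k, m):
    """Fourier symbol of sqrt(Delta (Delta - m^2)^{-1}) = sqrt(k^2 / (k^2 + m^2)).

    k is the modulus of the spatial momentum, m the mass; both are real.
    The radicand lies in [0, 1], so the square root is always real.
    """
    k2 = k * k
    denom = k2 + m * m
    if denom == 0:
        raise ValueError("the symbol is undefined for k = 0 and m = 0")
    return math.sqrt(k2 / denom)


if __name__ == "__main__":
    for m in (4.0, 1.0, 0.1, 0.0):
        print(m, hat_symbol(3.0, m))
```

For fixed $k \neq 0$ the factor approaches 1 as $m \to 0$, so the disappearance of the longitudinal mode in (2.4.16) is entirely due to the overall $m^2$ multiplying its action.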

Finally, observe that, after having eliminated the non-dynamical field, the dynamical ones come with the Klein-Gordon kinetic terms, even though they are not representations of the Lorentz group. This is because Poincaré invariance implies the standard relativistic dispersion relation $E^2 = m^2 + \vec{p}^{\,2}$ for the dynamical fields.


2.4.2 Spin 2 

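We start by defining the harmonic variables

$$ h_{00} = \psi , \qquad h_{0i} = \partial_i \chi + \chi_i , \qquad h_{ij} = \frac{1}{d} \delta_{ij} \phi + \left( \partial_i \partial_j - \frac{1}{d} \delta_{ij} \Delta \right) \lambda + \partial_{(i} \beta_{j)} + \tau_{ij} , \tag{2.4.17} $$

$$ T_{00} = \rho , \qquad T_{0i} = -\partial_i q - q_i , \qquad T_{ij} = \delta_{ij} p + \left( \partial_i \partial_j - \frac{1}{d} \delta_{ij} \Delta \right) \sigma + \partial_{(i} \sigma_{j)} + \sigma_{ij} , \tag{2.4.18} $$

where

$$ \partial_i \chi_i = \partial_i \beta_i = \tau_{ii} = \partial_i q_i = \partial_i \sigma_i = \sigma_{ii} = 0 , \qquad \partial_i \tau_{ij} = \partial_i \sigma_{ij} = 0 , \tag{2.4.19} $$

while the inverse relation is

$$ \chi = \Delta^{-1} \partial_i h_{0i} , \tag{2.4.20} $$

$$ \chi_i = P_{ij} h_{0j} , \tag{2.4.21} $$

$$ \phi = h_{ii} , \tag{2.4.22} $$

$$ \lambda = -\frac{1}{d - 1} \Delta^{-1} \left[ h_{ii} - d \, \Delta^{-1} \partial_i \partial_j h_{ij} \right] , \tag{2.4.23} $$

$$ \beta_i = 2 \Delta^{-1} P_{ij} \partial_k h_{jk} , \tag{2.4.24} $$

$$ \tau_{ij} = P_{ijkl} h_{kl} , \tag{2.4.25} $$

and so on for the components of $T_{\mu\nu}$, where $P_{ijkl}$ is the projector on the subspace of d-tensors (transverse-traceless SO(d)-tensors)

$$ P_{ij}^{\;\;kl} \equiv P_{(i}^{\;k} P_{j)}^{\;l} - \frac{1}{d - 1} P_{ij} P^{kl} , \qquad P_{ij}^{\;\;kl} P_{kl}^{\;\;nm} = P_{ij}^{\;\;nm} , \qquad \partial_i P_{ijkl} = 0 , \qquad P_{iikl} = 0 . \tag{2.4.26} $$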
Decomposing the gauge parameter as well

$$ \xi_0 = A , \qquad \xi_i = \partial_i B + B_i , \qquad \partial_i B_i = 0 , \tag{2.4.27} $$

we get that the gauge transformation (2.2.15) reads

$$ \delta\psi = -2 \dot{A} , \tag{2.4.28} $$

$$ \delta\chi = -A - \dot{B} , \tag{2.4.29} $$

$$ \delta\chi_i = -\dot{B}_i , \tag{2.4.30} $$

$$ \delta\phi = -2 \Delta B , \tag{2.4.31} $$

$$ \delta\lambda = -2 B , \tag{2.4.32} $$

$$ \delta\beta_i = -2 B_i , \tag{2.4.33} $$

$$ \delta\tau_{ij} = 0 , \tag{2.4.34} $$

so one can form the following independent gauge-invariant combinations

$$ \Psi \equiv \psi - 2 \dot\chi + \ddot\lambda , \qquad \Phi \equiv \Delta\lambda - \phi , \qquad \Xi_i \equiv \chi_i - \frac{1}{2} \dot\beta_i , \tag{2.4.35} $$

known as “Bardeen potentials” [82], of which $\Phi$ is already known from the previous section. Finally, the conservation equation $\partial_\mu T^{\mu\nu} = 0$ gives

$$ \dot\rho = -\Delta q , \qquad \dot{q} = -p - \frac{d - 1}{d} \Delta\sigma , \qquad \dot{q}_i = -\frac{1}{2} \Delta\sigma_i , \tag{2.4.36} $$

and, as in the spin-1 case, these will be implicitly used whenever we have a time-derivative acting on a source component in the subsequent computations. The action (2.3.13) in terms of these variables reads


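$$ \begin{aligned} S = \int d^D x \, \bigg[ & \frac{d - 1}{d} \left( -\frac{1}{2} \dot\Phi^2 + \frac{d - 2}{2d} \left( \partial_i \Phi \right)^2 + \partial_i \Phi \, \partial_i \Psi \right) + \left( \partial_i \Xi_j \right)^2 - \frac{1}{2} \partial_\mu \tau_{ij} \partial^\mu \tau_{ij} \\ & - \frac{1}{2} m^2 \bigg( \frac{d - 1}{d} \left( -\Phi^2 + 2 \Phi \Delta\lambda \right) + 2 \left( \Delta\lambda - \Phi \right) \psi - a \left( \Phi + \psi - \Delta\lambda \right)^2 \\ & \qquad\quad - 2 \left( \partial_i \chi \right)^2 - 2 \chi_i^2 + \frac{1}{2} \left( \partial_i \beta_j \right)^2 + \tau_{ij}^2 \bigg) + \Psi \rho - \Phi p + 2 \Xi_i q_i + \tau_{ij} \sigma_{ij} \bigg] . \end{aligned} \tag{2.4.37} $$

As in the case of electrodynamics, note that for $m = 0$ only gauge-invariant quantities appear, thus making the symmetry manifest. Again, we cannot consider $\Psi$ and $\Xi_i$ as independent variables with respect to which we could vary the action, because they contain time-derivatives of the original fields and their initial conditions are thus not defined. This is however not the case of $\Phi$, so we choose to consider $\psi, \chi, \Phi, \lambda, \chi_i, \beta_i, \tau_{ij}$ as the independent fields, while $\Psi$ and $\Xi_i$ are mere shorthand notations.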
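The gauge invariance of the Bardeen potentials (2.4.35) under (2.4.28)-(2.4.32) can be verified mode by mode. The sketch below assumes a plane-wave convention that is not in the text: every amplitude multiplies $e^{i(\vec k \cdot \vec x - \omega t)}$, so that $\partial_t \to -i\omega$ and $\Delta \to -k^2$. Function and key names are purely illustrative.

```python
def gauge_shift(omega, k2, A, B):
    """Gauge variations (2.4.28)-(2.4.32) of the d-scalar amplitudes."""
    dt = -1j * omega   # time derivative acting on the plane wave
    lap = -k2          # spatial Laplacian acting on the plane wave
    return {
        "psi": -2 * dt * A,
        "chi": -A - dt * B,
        "phi": -2 * lap * B,
        "lam": -2 * B,
    }


def bardeen_scalars(f, omega, k2):
    """Return (Psi, Phi) of (2.4.35) for the amplitudes in the dict f."""
    dt = -1j * omega
    lap = -k2
    Psi = f["psi"] - 2 * dt * f["chi"] + dt * dt * f["lam"]
    Phi = lap * f["lam"] - f["phi"]
    return Psi, Phi


def trace(f):
    """Trace h = -h_00 + h_ii = -psi + phi, which is not gauge-invariant."""
    return -f["psi"] + f["phi"]


if __name__ == "__main__":
    omega, k2 = 1.3, 2.0
    fields = {"psi": 0.2 + 0.1j, "chi": -1.0j, "phi": 0.5, "lam": 0.3 - 0.7j}
    shift = gauge_shift(omega, k2, A=0.7 + 0.2j, B=-0.4 + 1.1j)
    shifted = {name: fields[name] + shift[name] for name in fields}
    print(bardeen_scalars(fields, omega, k2))
    print(bardeen_scalars(shifted, omega, k2))
```

The two printed pairs agree up to rounding, while the trace changes, in line with the statement that only $\Psi$ and $\Phi$ are gauge-invariant in the d-scalar sector.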


Massless 

So let us start with the massless case $m = 0$ and compute the equations of motion. In the d-scalar sector, the ones of $\psi$ and $\Phi$ read

$$ \Delta\Phi = \frac{d}{d - 1} \rho , \qquad \ddot\Phi - \frac{d - 2}{d} \Delta\Phi - \Delta\Psi = \frac{d}{d - 1} p , \tag{2.4.38} $$

respectively, while the ones of $\chi$ and $\lambda$ are the first and second time-derivative of the former. To simplify the second equation we note that by taking the double time-derivative of the first one and using (2.4.36) we get

$$ \Delta\ddot\Phi = \frac{d}{d - 1} \ddot\rho = -\frac{d}{d - 1} \Delta\dot{q} = \frac{d}{d - 1} \Delta \left( p + \frac{d - 1}{d} \Delta\sigma \right) , \tag{2.4.39} $$

so that the equation of $\Phi$ actually gives

$$ \Delta\Psi = -\frac{d - 2}{d - 1} \rho + \Delta\sigma . \tag{2.4.40} $$

In the vector sector we have the equation of motion of $\chi_i$

$$ \Delta\Xi_i = q_i , \tag{2.4.41} $$

and the one of $\beta_i$ which is its time-derivative, while finally for the tensor modes

$$ \square\tau_{ij} = -\sigma_{ij} . \tag{2.4.42} $$

Therefore, the Bardeen variables are physical but non-dynamical fields, thus leaving the $d^2 - d - 2$ components of $\tau_{ij}$ and $\dot\tau_{ij}$ as the only degrees of freedom/dynamical fields of the theory.
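The counting quoted in the text can be cross-checked sector by sector. The sketch below (function names are illustrative) counts the independent components of a d-tensor, a d-vector and a d-scalar and doubles them to obtain phase-space degrees of freedom; the massive count refers to the Fierz-Pauli point $a = 0$, where no ghost scalar is present.

```python
def sector_components(d):
    """Independent components of the d-tensor, d-vector and d-scalar modes."""
    if d < 2:
        raise ValueError("the counting below assumes d >= 2 spatial dimensions")
    tensor = d * (d + 1) // 2 - d - 1   # symmetric, transverse, traceless
    vector = d - 1                      # transverse
    scalar = 1
    return tensor, vector, scalar


def phase_space_dof(d, massive):
    """N_f = 2 N_d for a linear spin-2 field in d spatial dimensions."""
    tensor, vector, scalar = sector_components(d)
    fields = tensor + (vector + scalar if massive else 0)
    return 2 * fields


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, phase_space_dof(d, massive=False), phase_space_dof(d, massive=True))
```

For $d = 3$ this gives 4 in the massless case and 10 in the massive case, matching $d^2 - d - 2$ and $d^2 + d - 2$ respectively.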


Massive 


Let us now turn on the mass $m \neq 0$, in which case the equations of motion of $\chi, \lambda, \beta_i$ are no longer implied by the ones of $\psi$ and $\chi_i$. Since there are a lot of variables now, it is not particularly illuminating to work at the level of the equations of motion. Rather, we can proceed directly as we did at the end of the spin-1 case, that is, integrate-out at the level of the action the manifestly non-dynamical modes, i.e. those without time-derivatives. For notational simplicity, we will consider the d-scalar, d-vector and d-tensor sectors separately. As far as the last two are concerned, the procedure and properties are exactly analogous to the ones of the d-scalar and d-vector modes in massive electrodynamics. For the d-tensor sector there is nothing to do, we simply have that it becomes massive

$$ S_{\rm tensor} = \int d^D x \left[ -\frac{1}{2} \partial_\mu \tau_{ij} \partial^\mu \tau_{ij} - \frac{1}{2} m^2 \tau_{ij}^2 + \tau_{ij} \sigma_{ij} \right] . \tag{2.4.43} $$

For the d-vector sector the non-dynamical field is $\chi_i$, so integrating it out in (2.4.37) and using the spatially non-local redefinition

$$ \hat\beta_i \equiv \sqrt{\Delta \left( \Delta - m^2 \right)^{-1}} \, \beta_i , \qquad \hat\sigma_i \equiv \sqrt{\Delta \left( \Delta - m^2 \right)^{-1}} \, \sigma_i , \tag{2.4.44} $$

we get the spatially local action

$$ S_{\rm vector} = \frac{m^2}{2} \int d^D x \left[ -\frac{1}{2} \partial_\mu \hat\beta_i \partial^\mu \hat\beta_i - \frac{1}{2} m^2 \hat\beta_i^2 + \hat\beta_i \hat\sigma_i \right] + \mathcal{O}(T^2) . \tag{2.4.45} $$
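The integration over $\chi_i$ can be sketched explicitly. The starting point below is our reading of the d-vector part of (2.4.37), which should be taken as an assumption; $\simeq$ denotes equality up to total derivatives and the last step uses $\dot q_i = -\frac{1}{2}\Delta\sigma_i$ from (2.4.36).

```latex
\begin{align*}
\mathcal{L}_{\rm vector} &= \partial_j \Xi_i \, \partial_j \Xi_i + m^2 \chi_i \chi_i
   - \tfrac{1}{4} m^2 \, \partial_i \beta_j \, \partial_i \beta_j + 2 \, \Xi_i q_i ,
   \qquad \Xi_i = \chi_i - \tfrac{1}{2} \dot\beta_i , \\
\frac{\delta S}{\delta \chi_i} = 0 \;&\Rightarrow\;
   \chi_i = \tfrac{1}{2} \left( \Delta - m^2 \right)^{-1} \left( \Delta \dot\beta_i + 2 q_i \right) , \\
\mathcal{L}_{\rm vector} &\simeq \tfrac{1}{4} m^2 \left[ \dot\beta_i \, \Delta \left( \Delta - m^2 \right)^{-1} \dot\beta_i
   + \beta_i \Delta \beta_i \right]
   + m^2 \, q_i \left( \Delta - m^2 \right)^{-1} \dot\beta_i + \mathcal{O}(T^2) \\
&\simeq \frac{m^2}{2} \left[ -\tfrac{1}{2} \partial_\mu \hat\beta_i \, \partial^\mu \hat\beta_i
   - \tfrac{1}{2} m^2 \hat\beta_i^2 + \hat\beta_i \hat\sigma_i \right] + \mathcal{O}(T^2) .
\end{align*}
```

In the last line we used $\beta_i \Delta \beta_i = \hat\beta_i (\Delta - m^2) \hat\beta_i$, which follows directly from the definition (2.4.44).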


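As for the longitudinal mode in the spin-1 case, we have that the d-vector mode activated by the mass, $\hat\beta_i$, is not gauge-invariant and smoothly disappears in the $m \to 0$ limit. The novel feature in the spin-2 case, as already anticipated in the previous sections, lies in the d-scalar sector. We can start by integrating-out $\chi$ to get

$$ \begin{aligned} S_{\rm scal.} = \int d^D x \, \bigg[ & \frac{d - 1}{d} \left( -\frac{1}{2} \dot\Phi^2 - \frac{d - 1}{d m^2} \left( \partial_i \dot\Phi \right)^2 + \frac{d - 2}{2d} \left( \partial_i \Phi \right)^2 + \partial_i \Phi \, \partial_i \left( \psi + \ddot\lambda \right) \right) \\ & - \frac{1}{2} m^2 \left( \frac{d - 1}{d} \left( -\Phi^2 + 2 \Phi \Delta\lambda \right) + 2 \left( \Delta\lambda - \Phi \right) \psi - a \left( \Phi + \psi - \Delta\lambda \right)^2 \right) \\ & + \psi \rho - \Phi p + \frac{2 (d - 1)}{d m^2} \Phi \Delta \left( p + \frac{d - 1}{d} \Delta\sigma \right) + \Delta\lambda \left( p + \frac{d - 1}{d} \Delta\sigma \right) + \mathcal{O}(T^2) \bigg] . \end{aligned} \tag{2.4.46} $$

At this point, it is convenient to trade $\psi$ for the trace

$$ h \equiv h^\mu_{\;\mu} = -\psi - \Phi + \Delta\lambda , \tag{2.4.47} $$

in which case we have

$$ \begin{aligned} S_{\rm scal.} = \int d^D x \, \bigg[ & \frac{d - 1}{d} \left( -\frac{1}{2} \dot\Phi^2 - \frac{d - 1}{d m^2} \left( \partial_i \dot\Phi \right)^2 - \frac{d + 2}{2d} \left( \partial_i \Phi \right)^2 - \partial_i \Phi \, \partial_i h - \Delta\Phi \left( \ddot\lambda + \Delta\lambda \right) \right) \\ & - \frac{1}{2} m^2 \left( \frac{d + 1}{d} \Phi^2 + 2 \left( \Delta\lambda \right)^2 - \frac{2 (d + 1)}{d} \Phi \Delta\lambda + 2 h \left( \Phi - \Delta\lambda \right) - a h^2 \right) \\ & - h \rho - \Phi \left( \rho + p \right) + \frac{2 (d - 1)}{d m^2} \Phi \Delta \left( p + \frac{d - 1}{d} \Delta\sigma \right) + \Delta\lambda \left( \rho + p + \frac{d - 1}{d} \Delta\sigma \right) + \mathcal{O}(T^2) \bigg] . \end{aligned} \tag{2.4.48} $$

We next integrate-out $h$, i.e. we solve the equation of motion of $h$

$$ h = -\frac{1}{a m^2} \left( \frac{d - 1}{d} \Delta\Phi - m^2 \Phi + m^2 \Delta\lambda - \rho \right) \equiv -\frac{1}{a m^2} \, G , \tag{2.4.49} $$

and plug it back inside the action. Choosing the above defined $G$ and $\hat\Phi \equiv \Phi + G/m^2$ as the independent fields instead of $\{\Phi, \lambda\}$, we get a diagonal action

$$ \begin{aligned} S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \, \bigg[ & -\frac{1}{2} \partial_\mu \hat\Phi \partial^\mu \hat\Phi - \frac{1}{2} m^2 \hat\Phi^2 - \hat\Phi \left( \rho - \Delta\sigma \right) \\ & + \frac{1}{m^4} \left( \frac{1}{2} \partial_\mu G \partial^\mu G - \frac{1}{2} \, \frac{1 + d \left( 1 + 1/a \right)}{d - 1} \, m^2 G^2 - \frac{m^2}{d - 1} \, G \left( \rho - d p \right) \right) \bigg] + \mathcal{O}(T^2) . \end{aligned} \tag{2.4.50} $$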

Note that $G$ is proportional to the on-shell trace $h$, so in particular it is a Lorentz scalar on-shell. As a check, we can compare its equation of motion

$$ \left( \square + \frac{1 + d \left( 1 + 1/a \right)}{d - 1} \, m^2 \right) G = -\frac{m^2}{d - 1} \left( \rho - d p \right) , \tag{2.4.51} $$

with (2.2.19), using (2.4.49), and see that they match exactly. We have thus shown what we had claimed in section 2.2, i.e. it is indeed the trace which is the unstable mode and, more precisely, it is a ghost with mass

$$ m^2_{\rm ghost} \equiv -\frac{1 + d \left( 1 + 1/a \right)}{d - 1} \, m^2 . \tag{2.4.52} $$
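To get a feeling for how the ghost mass behaves in the $(m^2, a)$ plane, one can tabulate the coefficient $[1 + d(1 + 1/a)]/(d - 1)$ that multiplies $m^2 G$ in (2.4.51). The sketch below uses illustrative function names; the overall sign of $m^2_{\rm ghost}$ assumes the convention in which the equation of motion is written as $(\square - m^2_{\rm ghost}) G = \dots$.

```python
def ghost_coefficient(a, d=3):
    """Coefficient (1 + d (1 + 1/a)) / (d - 1) multiplying m^2 G in (2.4.51)."""
    if a == 0:
        raise ValueError("a = 0 is the Fierz-Pauli point: the ghost decouples")
    if d < 2:
        raise ValueError("need d >= 2 spatial dimensions")
    return (1.0 + d * (1.0 + 1.0 / a)) / (d - 1)


def ghost_mass_squared(m2, a, d=3):
    """m_ghost^2, assuming the equation reads (box - m_ghost^2) G = source."""
    return -ghost_coefficient(a, d) * m2


if __name__ == "__main__":
    for a in (-1.0, -0.1, -0.01, 0.01, 0.1, 1.0):
        print(a, ghost_mass_squared(1.0, a))
```

The magnitude grows without bound as $a \to 0$ from either side, which is how the ghost decouples at the Fierz-Pauli point.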
Unlike the case of the d-vector sector or massive electrodynamics, here the action is non-analytic in both $m^2$ and $a$, if our fields are combinations of the $h_{\mu\nu}$ that are analytic in these parameters. In the $a \to 0$ limit, with $m$ kept fixed, we see that $m_{\rm ghost}$ diverges, while the coupling to the source remains constant, so we effectively have $G = 0$ and thus also $\hat\Phi = \Phi$. It is also instructive to see how this condition appears when working directly at the $a = 0$ point. So let us go back to the level of (2.4.48), where now $h$ is a Lagrange multiplier. Integrating it out will therefore result in fixing another field, which we choose to be $\lambda$. The equation of motion of $h$ is then simply $G = 0$ and, plugging this inside the action, we are indeed left with$^{14}$

$$ S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \left[ -\frac{1}{2} \partial_\mu \Phi \partial^\mu \Phi - \frac{1}{2} m^2 \Phi^2 - \Phi \left( \rho - \Delta\sigma \right) \right] + \mathcal{O}(T^2) . \tag{2.4.53} $$


At this stage we can make a series of remarks. First, we have now reached in this formalism the same conclusion we did in the previous section, namely, that the scalar degree of freedom in FP theory is gauge-invariant and it does not go away in the $m \to 0$ limit. Second, what we have gained here with respect to the canonical formalism is a clearer picture of the whole $(m^2, a)$ plane provided by (2.4.50). Indeed, we are now able to see more clearly the fact that the vDVZ discontinuity as $m \to 0$ arises only for $a = 0$. If we go back to (2.4.50) and take $m \to 0$ with $a \neq 0$ fixed, we get

<f> = + (2.4.54) 

rri m. 

so that the kinetic terms in the action cancel out and we retrieve the same number of degrees of freedom as in the massless theory. We now understand that the vDVZ discontinuity and the fact that the FP d-scalar mode is gauge-invariant are intimately related features. Indeed, in the $m \to 0$ limit we retrieve the gauge symmetry so that only gauge-invariant combinations can survive. For instance, in the case of electrodynamics, we have that the longitudinal mode must disappear since it is not gauge-invariant. Here, for $a \neq 0$ we have that the d-scalar modes are not gauge-invariant and must thus disappear in the $m \to 0$ limit. On $a = 0$ however, since $\Phi$ survives the $m \to 0$ limit, it must be gauge-invariant.


Hidden gauge symmetry 

It is now the appropriate moment for discussing the fact that FP theory seems to have something that looks like a symmetry but which is not quite one. This was the novel result of our paper [79], where (2.4.53) was derived, and we have therefore elaborated on the physical significance of $\Phi$ being gauge-invariant.

Already from section 2.2.2 we know that, for on-shell configurations $h_{\mu\nu}$, the action is invariant under gauge transformations of the form $\xi_\mu = \partial_\mu \theta$ (see eq. (2.2.23)), but not the equations of motion. The fact that this holds for on-shell configurations is equivalent to the fact that here some equations of motion have to be used, i.e. the ones of the non-dynamical fields, in order to get the invariance. The additional information we gain here with respect to section 2.2.2 is that the sector of the equations of motion which corresponds to the dynamical field $\Phi$ is also gauge-invariant, a feature which is not visible when we work with $h_{\mu\nu}$. Moreover,

14 After deriving this result we were informed by S. Deser (private communication) that a similar form was 
obtained in an old and little known paper [83]. Interestingly enough, this paper appeared in 1966, that is 14 
years before the introduction of gauge-invariant variables by Bardeen [82] in cosmological perturbation theory. 
the transformation considered in section 2.2.2 was one-dimensional, whereas here we have two gauge parameters in the d-scalar sector of (2.4.27), that is, $A$ and $B$. Trading the former for $\tilde{A} \equiv A - \dot{B}$, we can write the corresponding gauge parameter (2.4.27)

$$ \xi_\mu = \partial_\mu B + \delta_\mu^0 \tilde{A} , \tag{2.4.55} $$

so that $B$ corresponds to the $\theta$ parameter considered in section 2.2.2, while $\tilde{A}$ parametrizes the additional transformation under which (2.4.53) is invariant.

Now observe that, if we perform the gauge transformation (2.4.55) at the level of the original action, we get a new action that depends on the gauge parameters

$$ S_{A,B}[h_{\mu\nu}] = S[h_{\mu\nu}] + \Delta S[h_{\mu\nu}, A, B] , \tag{2.4.56} $$

where $\Delta S \neq 0$ for general $h_{\mu\nu}$, so that this is not a gauge symmetry. If we decompose $h_{\mu\nu}$ harmonically, we have that $S_{A,B}$ will correspond to $S$ with $\psi, \chi, \Phi, \lambda$ replaced by

$$ \psi - 2 \dot{A} , \qquad \chi - A - \dot{B} , \qquad \Phi , \qquad \lambda - 2 B , \tag{2.4.57} $$

respectively. Then, since these are simply the original variables that have been shifted, integrating-out the non-dynamical ones will automatically yield again (2.4.53), whatever the values of $A$ and $B$. Thus, although the actions $S_{A,B}$ are not the same, they do reduce to the same action once the non-dynamical fields are integrated-out. In conclusion, although the action is not invariant under the $A, B$ transformation, the physics is. It is in this sense that this is a “hidden” gauge symmetry.


2.4.3 de-Sitter background 

As a final display of the power of the harmonic formalism, let us apply it to the case where the background space-time is de-Sitter and see whether it is still a gauge-invariant field which propagates in the d-scalar sector for $a = 0$. This is not guaranteed a priori because on flat space-time we concluded that it was the vDVZ discontinuity which was responsible for this and, as it turns out, there is no discontinuity on a de-Sitter background [31,32].

It is convenient to work in the following coordinates

$$ g_{00} = -1 , \qquad g_{0i} = 0 , \qquad g_{ij} = a^2 \delta_{ij} , \qquad a = e^{Ht} , \tag{2.4.58} $$

where $a(t)$ is the scale factor and $H$ the (constant) Hubble parameter. We consider directly the case of the linear massive spin-2 field and obtain its action by linearizing the Einstein-Hilbert action with cosmological constant

$$ \Lambda \equiv \frac{d (d - 1)}{2} H^2 , \tag{2.4.59} $$

around the corresponding de-Sitter solution. Also appending a FP mass term and a linear source this gives


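$$ \begin{aligned} S = \int d^D x \, a^d \, \bigg[ & -\frac{1}{2} \nabla_\rho h_{\mu\nu} \nabla^\rho h^{\mu\nu} + \nabla_\rho h_{\mu\nu} \nabla^\nu h^{\mu\rho} - \nabla_\mu h \nabla_\nu h^{\mu\nu} + \frac{1}{2} \nabla_\mu h \nabla^\mu h \\ & + d H^2 \left( h_{\mu\nu} h^{\mu\nu} - \frac{1}{2} h^2 \right) - \frac{1}{2} m^2 \left( h_{\mu\nu} h^{\mu\nu} - h^2 \right) + h_{\mu\nu} T^{\mu\nu} \bigg] , \end{aligned} \tag{2.4.60} $$

where the non-vanishing components of the Christoffel symbols $\Gamma^\rho_{\mu\nu}$ are

$$ \Gamma^i_{j0} = H \delta^i_j , \qquad \Gamma^0_{ij} = H g_{ij} . \tag{2.4.61} $$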

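For $m = 0$, we have the gauge symmetry

$$ \delta h_{\mu\nu} = -\nabla_\mu \xi_\nu - \nabla_\nu \xi_\mu , \tag{2.4.62} $$

provided the source satisfies the background-covariant conservation equation $\nabla_\mu T^{\mu\nu} = 0$, while for $m^2 = (d - 1) H^2$ we have a one-dimensional gauge symmetry

$$ \delta h_{\mu\nu} = -\nabla_\mu \nabla_\nu \theta - g_{\mu\nu} H^2 \theta , \tag{2.4.63} $$

provided the source satisfies

$$ \nabla_\mu \nabla_\nu T^{\mu\nu} + H^2 T = 0 . \tag{2.4.64} $$

The latter case is known as the “partially massless” theory because the gauge symmetry eliminates the d-scalar mode. It is convenient to define the following field strength [84]

$$ F_{\mu\nu\rho} \equiv \nabla_\mu h_{\nu\rho} - \nabla_\nu h_{\mu\rho} , \qquad F_\mu \equiv g^{\nu\rho} F_{\mu\nu\rho} , \tag{2.4.65} $$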

which is invariant under (2.4.63) and in terms of which the action becomes

$$ S = \int d^D x \, a^d \left[ -\frac{1}{4} \left( F_{\mu\nu\rho} F^{\mu\nu\rho} - 2 F_\mu F^\mu \right) - \frac{1}{2} M^2 \left( h_{\mu\nu} h^{\mu\nu} - h^2 \right) + h_{\mu\nu} T^{\mu\nu} \right] , \tag{2.4.66} $$

where

$$ M^2 \equiv m^2 - (d - 1) H^2 , \tag{2.4.67} $$

is precisely zero for the partially massless theory. This representation is quite elegant from the point of view of the partially massless theory $M = 0$ because it exhibits many analogies with electrodynamics: there is a one-dimensional gauge symmetry, the theory can be written as the square of some gauge-invariant field strength and there is a cohomological chain structure between the gauge parameter $\theta$, the field $h_{\mu\nu}$ and the field strength $F_{\mu\nu\rho}$ [84,85]. Then, $M$ appears as the mass that will break this symmetry and activate the d-scalar mode. In particular, we can already anticipate that for $M^2 < 0$ the theory will be unstable [86,87].

Here we will focus on the d-scalar sector of the theory only, since this is where the exotic features lie, and we will neglect the source for simplicity. In defining and using harmonic variables we must now pay attention to the fact that the position of the spatial indices matters, i.e. they are displaced using $g_{ij}$, so for instance

$$ \Delta \equiv g^{ij} \partial_i \partial_j = a^{-2} \partial_i \partial_i . \tag{2.4.68} $$

The definitions of the harmonic variables are the same, except for the spatial sectors whose natural generalization is

$$ h_{ij} = \frac{1}{d} g_{ij} \phi + \left( \partial_i \partial_j - \frac{1}{d} g_{ij} \Delta \right) \lambda , \tag{2.4.69} $$

$$ T_{ij} = g_{ij} p + \left( \partial_i \partial_j - \frac{1}{d} g_{ij} \Delta \right) \sigma . \tag{2.4.70} $$
Note that this changes only the definitions of $\phi$ and $p$. The inverse relation now reads

$$ \chi = \Delta^{-1} \partial^i h_{0i} , \tag{2.4.71} $$

$$ \phi = h^i_{\;i} , \tag{2.4.72} $$

$$ \lambda = -\frac{1}{d - 1} \Delta^{-1} \left[ h^i_{\;i} - d \, \Delta^{-1} \partial^i \partial^j h_{ij} \right] , \tag{2.4.73} $$

where $\partial_i$ is the same as before, but one must now use $g^{ij}$ to displace its indices. The gauge transformation (2.4.62) reads

$$ \delta\psi = -2 \dot{A} , \tag{2.4.74} $$

$$ \delta\chi = -A - \dot{B} + 2 H B , \tag{2.4.75} $$

$$ \delta\phi = -2 \Delta B + 2 d H A , \tag{2.4.76} $$

$$ \delta\lambda = -2 B , \tag{2.4.77} $$

so the Bardeen variables are

$$ \Psi = \psi - \partial_t \left( 2 \chi - \dot\lambda + 2 H \lambda \right) , \tag{2.4.78} $$

$$ \Phi = \Phi_0 - d H \left( 2 \chi - \dot\lambda + 2 H \lambda \right) , \tag{2.4.79} $$

where $\Phi_0 \equiv \Delta\lambda - \phi$ is the $\Phi$ of Minkowski space-time. We see that now both combinations include time-derivatives of the original variables so that none of these can be taken as a fundamental field since their initial conditions are undetermined. In particular, this seems to imply that the d-scalar degree of freedom on the FP point $a = 0$ will not be gauge-invariant.

Nevertheless, one must note that the Bardeen variables are the only gauge-invariant combinations (up to combinations among themselves) that are local in time in the harmonic variables. This is certainly convenient, although not at all a physical requirement. In fact, as we will see in a moment, if we abandon this property we get access to gauge-invariant combinations that do not suffer from the above initial condition problem.

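We can now write down the d-scalar part of the action

$$ \begin{aligned} S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \, a^d \, \bigg[ & -\frac{1}{2} \left( \dot\Phi - d H \Psi \right)^2 + \frac{d - 2}{2d} \partial_i \Phi \partial^i \Phi + \partial_i \Phi \partial^i \Psi \\ & - \frac{1}{2} m^2 \left( -\Phi_0^2 + 2 \Phi_0 \Delta\lambda + \frac{2d}{d - 1} \left( \Delta\lambda - \Phi_0 \right) \psi - \frac{2d}{d - 1} \partial_i \chi \partial^i \chi \right) \bigg] . \end{aligned} \tag{2.4.80} $$

Again, when $m = 0$ we see that only gauge-invariant combinations appear and we retrieve the flat space-time result for $H \to 0$. For $H \neq 0$, the second and third terms of the first line can be rewritten in a convenient way

$$ \int d^D x \, a^d \left[ \frac{d - 2}{2d} \partial^i \Phi \partial_i \Phi + \partial_i \Phi \partial^i \Psi \right] = -\frac{1}{d H} \int d^D x \, a^d \, \partial_i \Phi \, \partial^i \left( \dot\Phi - d H \Psi \right) , \tag{2.4.81} $$

where we have used the fact that

$$ \partial_i \Phi \, \partial^i \dot\Phi = \frac{1}{2} a^{-2} \partial_t \left( \partial_i \Phi \right)^2 , \tag{2.4.82} $$

and then integrated by parts the time derivative. Observe also that

$$ K \equiv \dot\Phi - d H \Psi = \dot\Phi_0 - d H \psi , \tag{2.4.83} $$

so that this combination actually only depends on $\Phi_0$ and $\psi$. We can then consider $\psi, \chi, \Phi_0, \lambda$ as our independent variables and integrate by parts here and there to finally get

$$ \begin{aligned} S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \, a^d \, \bigg[ & -\frac{1}{2} K^2 - \frac{1}{d H} \partial_i \left( \Phi_0 - 2 d H \chi \right) \partial^i K - \Delta\lambda \left( \dot{K} + d H K + m^2 \Phi_0 + \frac{d}{d - 1} m^2 \psi \right) \\ & + \frac{1}{2} m^2 \left( \Phi_0^2 + \frac{2d}{d - 1} \Phi_0 \psi + \frac{2d}{d - 1} \partial_i \chi \partial^i \chi \right) \bigg] . \end{aligned} \tag{2.4.84} $$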

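We start by integrating-out $\chi$

$$ \begin{aligned} S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \, a^d \, \bigg[ & -\frac{1}{2} K^2 - \frac{d - 1}{d m^2} \partial_i K \partial^i K - \frac{1}{d H} \partial_i \Phi_0 \partial^i K \\ & - \Delta\lambda \left( \dot{K} + d H K + m^2 \Phi_0 + \frac{d}{d - 1} m^2 \psi \right) + \frac{1}{2} m^2 \left( \Phi_0^2 + \frac{2d}{d - 1} \Phi_0 \psi \right) \bigg] . \end{aligned} $$

We now rescale our fields

$$ \{\psi, \Phi_0, \lambda\} \to a^{-(d - 1)} \{\psi, \Phi_0, \lambda\} , \tag{2.4.85} $$

and trade $\psi$ for the new variable

$$ \psi' \equiv \psi + \frac{d - 1}{d} \Phi_0 , \tag{2.4.86} $$

so that

$$ K \to a^{-(d - 1)} \left( \dot\Phi_0 - d H \psi' \right) \equiv a^{-(d - 1)} K' , \tag{2.4.87} $$

and get

$$ \begin{aligned} S_{\rm scal.} = \frac{d - 1}{d} \int d^D x \, a^{-(d - 2)} \, \bigg[ & -\frac{1}{2} K'^2 - \frac{d - 1}{d m^2} \partial_i K' \partial^i K' - \frac{1}{d H} \partial_i \Phi_0 \partial^i K' \\ & - \Delta\lambda \left( \dot{K}' + H K' + \frac{d}{d - 1} m^2 \psi' \right) + \frac{1}{2} m^2 \left( -\Phi_0^2 + \frac{2d}{d - 1} \Phi_0 \psi' \right) \bigg] . \end{aligned} $$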


We can now trade $\Phi_0$ for the new variable

$$ \Omega(t) \equiv \Phi_0(t) - d H \int_{t_i}^{t} dt' \, \psi'(t') + \frac{(d - 1) H}{M^2} \left[ \dot\Phi_0(t_i) - d H \psi'(t_i) + H \Phi_0(t_i) \right] , \tag{2.4.88} $$

where $t_i$ is the time at which the initial conditions are given and we have omitted the $\vec{x}$ dependence for notational simplicity. The choice of the time-independent term will be justified later. As already suggested above, this is a non-local generalization of $\Phi_0$ to de-Sitter space-time, for which the Cauchy problem is well-defined. Indeed, it is gauge invariant, as we will show in a moment, and the initial data $\{\Omega(t_i), \dot\Omega(t_i)\}$ are in bijection with $\{\Phi_0(t_i), \dot\Phi_0(t_i)\}$

$$ \Omega(t_i) = \frac{m^2}{M^2} \Phi_0(t_i) + \frac{(d - 1) H}{M^2} \left[ \dot\Phi_0(t_i) - d H \psi'(t_i) \right] , \tag{2.4.89} $$

$$ \dot\Omega(t_i) = \dot\Phi_0(t_i) - d H \psi'(t_i) , \tag{2.4.90} $$

contrary to the Bardeen variable $\Phi$. We can then invert this to get

$$ \Phi_0(t) = \Omega(t) + d H \int_{t_i}^{t} dt' \, \psi'(t') - \frac{(d - 1) H}{m^2} \left[ \dot\Omega(t_i) + H \Omega(t_i) \right] , \tag{2.4.91} $$

so, although $K' = \dot\Omega$, performing this replacement in the action will yield non-local terms because of the presence of undotted $\Phi_0$'s. However, after integrating-out $\lambda$ and solving for $\psi'$, we get that the latter becomes a total time derivative

$$ \psi' = -\frac{d - 1}{d m^2} \, \partial_t \left( \dot\Omega + H \Omega \right) , \tag{2.4.92} $$

so that (2.4.91) becomes local

$$ \Phi_0(t) = \frac{M^2}{m^2} \Omega(t) - \frac{(d - 1) H}{m^2} \dot\Omega(t) . \tag{2.4.93} $$

We now understand that the time-independent piece in (2.4.88) was chosen precisely such that it cancels the one arising in the above integral. We are thus left with an action for $\Omega$ alone which, after many integrations by parts, gives


$$ S_{\rm scal.} = \frac{d - 1}{d} \frac{M^2}{m^2} \int d^D x \, a^{-(d - 2)} \left[ -\frac{1}{2} \partial_\mu \Omega \partial^\mu \Omega - \frac{1}{2} M^2 \Omega^2 \right] . \tag{2.4.94} $$

If we rescale back in order to obtain the volume form $\sqrt{-g} = a^d$ for the integration measure

$$ \Omega \to a^{d - 1} \Omega , \tag{2.4.95} $$

we get

$$ S_{\rm scal.} = \frac{d - 1}{d} \frac{M^2}{m^2} \int d^D x \, \sqrt{-g} \left[ -\frac{1}{2} \partial_\mu \Omega \partial^\mu \Omega - \frac{1}{2} m^2 \Omega^2 \right] , \tag{2.4.96} $$

so this has also the effect of replacing $M^2$ with $m^2$ in the mass term. The dynamical mode in the d-scalar sector is therefore $\Omega$, which in terms of the original fields reads

$$ \begin{aligned} \Omega(t) &= \Phi_0(t) - d H \, a^{-(d - 1)}(t) \int_{t_i}^{t} dt' \, a^{d - 1}(t') \left[ \psi(t') + \frac{d - 1}{d} \Phi_0(t') \right] \\ &\quad + \left( \frac{a(t_i)}{a(t)} \right)^{d - 1} \frac{(d - 1) H}{M^2} \left[ \dot\Phi_0(t_i) - d H \psi(t_i) + H \Phi_0(t_i) \right] \\ &= a^{-(d - 1)}(t) \int_{t_i}^{t} dt' \, a^{d - 1}(t') K(t') + \left( \frac{a(t_i)}{a(t)} \right)^{d - 1} \left[ \frac{m^2}{M^2} \Phi_0(t_i) + \frac{(d - 1) H}{M^2} K(t_i) \right] . \end{aligned} \tag{2.4.97} $$

Under a gauge transformation (2.4.62) we have that $K$ is gauge-invariant so

$$ \delta\Omega(t) \propto A(t_i) , \tag{2.4.98} $$

and thus $\Omega$ is gauge-invariant if we set $A(t_i) = 0$. Note that this restriction is by no means a loss of symmetry since it concerns only a subset of measure zero of the gauge parameters. One can still use such an $A(t)$ to trivialize the time-evolution of a field mode. Thus, after having integrated-out the non-dynamical fields, the d-scalar sector remains gauge-invariant even in de-Sitter space.

A very elegant feature of our result (2.4.96) is that it renders the dependence of the spectrum on $M$ quite transparent. $\Omega$ becomes non-dynamical when $M \to 0$, in which case we reach the partially massless theory with gauge symmetry (2.4.63),$^{15}$ while for $M^2 < 0$ that mode becomes a ghost. The stability condition $M^2 > 0$ is known as the “Higuchi bound” [87]. We also see that the “natural variables” with respect to the interpretation of $M^2$ as being the mass of the partially massless theory are the rescaled ones, since it is for these fields that $M$ appears as the mass (2.4.94) and for which $\Omega$ involves no $a$ in its definition (2.4.88).
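The three regimes read off from (2.4.96) can be summarized in a small helper. This is only a sketch: the function name, the labels and the tolerance parameter `tol` are illustrative choices, and the classification refers to the d-scalar mode of the $a = 0$ theory on de-Sitter.

```python
def classify_d_scalar(m2, H, d=3, tol=1e-12):
    """Return (M2, label) with M2 = m^2 - (d - 1) H^2 as in (2.4.67).

    label is 'massless' for m^2 = 0 (full gauge symmetry (2.4.62)),
    'partially massless' for M^2 = 0, 'ghost' for M^2 < 0 and
    'healthy' for M^2 > 0 (Higuchi bound satisfied).
    """
    if m2 < -tol:
        raise ValueError("m^2 is assumed non-negative in this sketch")
    M2 = m2 - (d - 1) * H * H
    if abs(m2) <= tol:
        return M2, "massless"
    if abs(M2) <= tol:
        return M2, "partially massless"
    return M2, ("healthy" if M2 > 0 else "ghost")


if __name__ == "__main__":
    for m2 in (0.0, 1.0, 2.0, 3.0):
        print(m2, classify_d_scalar(m2, H=1.0))
```

For $d = 3$ and $H = 1$ the partially massless point sits at $m^2 = 2$, with a ghost below it and a healthy scalar above.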


2.5 Propagator 

The dynamical content and stability of a linear theory can also be deduced by looking at its 
propagator. Moreover, since the propagator is an essential building block of perturbative QFT, 
it is important to be able to “read from it” this important information of the theory. We will 
not write explicitly the $\epsilon$ prescription here since it depends on whether one is interested in
classical or quantum propagation. It will however be useful to use some QFT language, e.g.
the number of dynamical fields becomes the number of particle polarizations/states. 


2.5.1 Spin 1 

Writing the Proca action (2.2.1) in the form 


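$$ S = \int d^D x \left[ \frac{1}{2} A_\mu \mathcal{K}^{\mu\nu} A_\nu + A_\mu j^\mu \right] , \tag{2.5.1} $$

we can identify the quadratic structure

$$ \mathcal{K}^{\mu\nu} = \eta^{\mu\nu} \left( \square - m^2 \right) - \partial^\mu \partial^\nu . \tag{2.5.2} $$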


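The propagator is defined by

$$ \mathcal{K}^{\mu\rho} D_{\rho\nu} = i \delta^\mu_\nu , \tag{2.5.3} $$

whose solution is

$$ D_{\mu\nu}(k) = \frac{-i}{k^2 + m^2} \left( \eta_{\mu\nu} + \frac{k_\mu k_\nu}{m^2} \right) . \tag{2.5.4} $$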

The exchange of a photon between two vertices in the computation of a scattering amplitude 
will then be controlled by the saturated propagator 


$$ j'^{*\mu}(k) D_{\mu\nu}(k) j^\nu(k) = j'^*_\mu(k) \, \frac{-i}{k^2 + m^2} \, j^\mu(k) , \tag{2.5.5} $$


where here $j^\mu$, $j'^\mu$ either represent external on-shell sources, in which case conservation implies
$k_\mu j^\mu(k) = k_\mu j'^\mu(k) = 0$, or parts of a Feynman diagram to which the photon is attached, in

15 Since the mapping between $\Omega$ and $\phi_0$ is singular as $M \to 0$, we should actually check this result by working
directly on the $M = 0$ point, in which case integrating out $A$ to fix $\psi'$ gives $S_{\rm scal.} = 0$.
which case it is the Ward identity$^{16}$ which implies these equations. In the classical case, the
saturated propagator is what controls the interaction mediated by the electromagnetic field in
the perturbative equations of motion of the fields present in the source.
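
The inversion relation (2.5.3) can be checked numerically. The following is only a minimal sketch, assuming a mostly-plus metric, $D = 4$, and the Fourier replacements $\square \to -k^2$, $\partial^\mu\partial^\nu \to -k^\mu k^\nu$; the function name and the sample momentum are illustrative and not taken from the text.

```python
# Mostly-plus Minkowski metric in D = 4 (an illustrative choice of dimension).
ETA = [-1.0, 1.0, 1.0, 1.0]
DIM = len(ETA)


def eta(a, b):
    """Metric component; eta with upper or lower indices has the same entries."""
    return ETA[a] if a == b else 0.0


def proca_inversion_error(k_up, m):
    """Return max |K^{mu rho} D_{rho nu} - i delta^mu_nu| for an off-shell k^mu."""
    k_dn = [ETA[a] * k_up[a] for a in range(DIM)]
    k2 = sum(k_up[a] * k_dn[a] for a in range(DIM))
    # K^{mu nu}(k) = -(k^2 + m^2) eta^{mu nu} + k^mu k^nu
    kin = [[-(k2 + m * m) * eta(a, b) + k_up[a] * k_up[b]
            for b in range(DIM)] for a in range(DIM)]
    # D_{mu nu}(k) = -i/(k^2 + m^2) (eta_{mu nu} + k_mu k_nu / m^2)
    prop = [[-1j / (k2 + m * m) * (eta(a, b) + k_dn[a] * k_dn[b] / (m * m))
             for b in range(DIM)] for a in range(DIM)]
    err = 0.0
    for a in range(DIM):
        for b in range(DIM):
            entry = sum(kin[a][r] * prop[r][b] for r in range(DIM))
            target = 1j if a == b else 0.0
            err = max(err, abs(entry - target))
    return err


if __name__ == "__main__":
    print(proca_inversion_error([0.3, 1.1, -0.7, 0.4], 0.9))
```

The returned number should be at the level of floating-point rounding for any momentum with $k^2 + m^2 \neq 0$.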

In the massive case we have as many possible inversions of $\mathcal{K}^{\mu\nu}$ as with $\square - m^2$ because
of the homogeneous solutions. These are parametrized by all the possible linear superposition
amplitudes $a_i(\vec{k})$ (belonging to some space of integrable functions), that are functions on $\mathbb{R}^d$.
Going to the $m = 0$ case enlarges that kernel dramatically because now it also includes all the
pure-gauge solutions $A_\mu = \partial_\mu \theta$, parametrized by a function $\theta$ on $\mathbb{R}^D$. Thus, on top of the pole
contour prescription, which can be translated into a prescription on initial/final conditions, one
must also give a prescription for picking a preferred gauge, i.e. one must add a gauge-fixing
term. The usual Lorentz-invariant choice is


$$ S_{\rm gf} = -\frac{1}{2\xi} \int d^D x \left( \partial_\mu A^\mu \right)^2 , \tag{2.5.6} $$


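so that

$$ \mathcal{K}^{\mu\nu} = \eta^{\mu\nu} \square - \left( 1 - \frac{1}{\xi} \right) \partial^\mu \partial^\nu \tag{2.5.7} $$

is invertible and$^{17}$

$$ D_{\mu\nu}(k) = \frac{-i}{k^2} \left( \eta_{\mu\nu} - (1 - \xi) \frac{k_\mu k_\nu}{k^2} \right) . \tag{2.5.8} $$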


What matters for the gauge-fixing term to be valid is that the saturated propagator must be
independent of it because the physics cannot depend on a choice of gauge. Since $k_\mu j^\mu(k) = 0$,
which is also a consequence of gauge symmetry when $m = 0$, we have indeed the $\xi$-independent
result


$$ j'^{*\mu}(k) D_{\mu\nu}(k) j^\nu(k) = j'^*_\mu(k) \, \frac{-i}{k^2} \, j^\mu(k) . \tag{2.5.9} $$
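
The $\xi$-independence of the saturated propagator can be illustrated numerically. This is a sketch under stated assumptions: mostly-plus metric, $D = 4$, an off-shell momentum with $k^2 \neq 0$, and sources made conserved by hand through a projection; all function names are hypothetical.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # mostly-plus metric, D = 4 (illustrative)
DIM = len(ETA)


def dot(u_up, v_up):
    """Lorentz contraction u_mu v^mu, without complex conjugation."""
    return sum(ETA[a] * u_up[a] * v_up[a] for a in range(DIM))


def conserved(v_up, k_up):
    """Project an arbitrary vector onto the subspace k_mu j^mu = 0."""
    c = dot(k_up, v_up) / dot(k_up, k_up)
    return [v_up[a] - c * k_up[a] for a in range(DIM)]


def saturated(jp_up, j_up, k_up, xi):
    """j'^{* mu} D_{mu nu}(k) j^nu with the gauge-fixed propagator (2.5.8)."""
    k2 = dot(k_up, k_up)
    k_dn = [ETA[a] * k_up[a] for a in range(DIM)]
    total = 0j
    for a in range(DIM):
        for b in range(DIM):
            metric = ETA[a] if a == b else 0.0
            d_ab = -1j / k2 * (metric - (1.0 - xi) * k_dn[a] * k_dn[b] / k2)
            total += jp_up[a].conjugate() * d_ab * j_up[b]
    return total


def reference(jp_up, j_up, k_up):
    """Right-hand side of (2.5.9): -i/k^2 times j'^*_mu j^mu."""
    k2 = dot(k_up, k_up)
    contraction = sum(ETA[a] * jp_up[a].conjugate() * j_up[a] for a in range(DIM))
    return -1j / k2 * contraction


if __name__ == "__main__":
    k = [0.3, 1.1, -0.7, 0.4]
    j = conserved([1 + 2j, 0.5 + 0j, -0.3j, 2.0 + 0j], k)
    jp = conserved([0.2 + 0j, -1j, 1.5 + 0j, 0.7 + 0.1j], k)
    for xi in (0.0, 0.5, 1.0, 3.0):
        print(xi, saturated(jp, j, k, xi), reference(jp, j, k))
```

For conserved sources every value of `xi` should reproduce the same number, since the $k_\mu k_\nu$ term is annihilated on both sides.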


Comparing with (2.5.5) we note that the interaction between two sources mediated by the
photon is continuous in the $m \to 0$ limit. At the same time however, we know that the massive
photon has $d$ polarizations, while the massless one has $d - 1$ polarizations. To understand how
a discontinuity in this number can be consistent with a continuous limit at the propagator
level, we decompose $j^\mu$ and $j'^\mu$ harmonically (2.4.2) which in Fourier space gives

$$ j_0 = -\rho , \qquad j_i(k) = i k_i \sigma(k) + \sigma_i(k) , \qquad k_i \sigma_i(k) = 0 , \tag{2.5.10} $$

and similarly for $j'^\mu$. We then restrict to tree-level diagrams and sources with “mass” $m_s^2 = -k^2$,
so that $m_s$ is the “mass” of the virtual photon that is being exchanged. For instance, in the case
where the source is made of minimally coupled electrons and positrons we have that $m_s > 2 m_e$.

16 The Ward identity is usually presented as a direct consequence of gauge symmetry and it can therefore
appear as a surprise that it still holds in the massive case. However, note that one can also derive the identity
by simply using the operator equation $\partial_\mu A^\mu = 0$, which is valid in the massive case, when computing correlation
functions with on-shell external momenta, $\partial_\mu \langle 0 | T \{ A^\mu(x) \dots \} | 0 \rangle = 0$. Thus, the Ward identity still holds in
massive electrodynamics, not because $\partial_\mu A^\mu$ contains no propagating degrees of freedom as in the massless case,
but because $\partial_\mu A^\mu$ is simply zero on-shell.

17 For non-linear theories the gauge-fixing term breaks the unitarity of the S-matrix and one must also include
Faddeev-Popov fields to restore it.
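We also consider the case $m_s > m$ so that we do not have to deal with the complications of
resonances$^{18}$. We can then write the conservation equation (2.4.36) as

$$ \rho = \frac{i \vec{k}^2}{\sqrt{m_s^2 + \vec{k}^2}} \, \sigma , \tag{2.5.11} $$

where $\vec{k}$ is the spatial part of $k^\mu$, and similarly for $j'^\mu$, so that the saturated propagator reads

$$ j'^{*\mu}(k) D_{\mu\nu}(k) j^\nu(k) \Big|_{k^2 = -m_s^2} = \frac{-i}{m^2 - m_s^2} \left[ \frac{m_s^2 \, \vec{k}^2}{m_s^2 + \vec{k}^2} \, \sigma'^* \sigma + \sigma'^*_i \sigma_i \right] . \tag{2.5.12} $$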


The first term in the square bracket represents the exchange of the longitudinal photons between
the longitudinal modes of the sources, while the second term corresponds to the exchange of
a transverse photon between the transverse modes of the sources. We can now focus on the
case where $m_s^2 \to m^2$ from above, so that the photon gets close to being real. It can therefore
be considered as an external photon that has been “produced” by $j'^\mu$ at $t \to -\infty$ and then
“detected” by its interaction with $j^\mu$ at $t \to \infty$.

This shows how the continuity in the saturated propagator can be reconciled with the
discontinuity in the dynamical fields of the photon: the longitudinal information is simply
proportional to $m^2$ for real photons and thus smoothly decouples in the massless limit. It is
therefore not enough to look at the unsaturated propagators to deduce the number of dynamical
fields in the theory; one must also make use of the conservation equation of the source, which
brings in the mass dependence.
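
The way conservation brings in the mass dependence can be made concrete with a small numerical sketch. It assumes $d = 3$, the decomposition (2.5.10), and the conservation relation in the form $\rho = i \vec{k}^2 \sigma / \sqrt{m_s^2 + \vec{k}^2}$ (the overall sign of $\rho$ depends on the Fourier convention and drops out of the result). The function name and sample values are illustrative.

```python
import math


def source_contraction(k_vec, m_s, sigma, sigma_p, sigma_i, sigma_p_i):
    """Return (j'^*_mu j^mu, longitudinal piece, transverse piece).

    sigma_i and sigma_p_i are assumed transverse: k_i sigma_i = 0.
    """
    k2 = sum(c * c for c in k_vec)
    omega = math.sqrt(m_s ** 2 + k2)
    # Conservation fixes rho in terms of sigma (sign convention assumed).
    rho = 1j * k2 * sigma / omega
    rho_p = 1j * k2 * sigma_p / omega
    j0, jp0 = -rho, -rho_p
    ji = [1j * k_vec[n] * sigma + sigma_i[n] for n in range(len(k_vec))]
    jpi = [1j * k_vec[n] * sigma_p + sigma_p_i[n] for n in range(len(k_vec))]
    # Mostly-plus metric: j'^*_mu j^mu = -j'^*_0 j_0 + j'^*_i j_i
    contraction = -jp0.conjugate() * j0 + sum(
        jpi[n].conjugate() * ji[n] for n in range(len(k_vec)))
    longitudinal = m_s ** 2 * k2 / (m_s ** 2 + k2) * sigma_p.conjugate() * sigma
    transverse = sum(sigma_p_i[n].conjugate() * sigma_i[n]
                     for n in range(len(k_vec)))
    return contraction, longitudinal, transverse


if __name__ == "__main__":
    k_vec = [0.0, 0.0, 2.0]
    args = (0.7 - 0.2j, 1.1 + 0.4j, [1 + 1j, 0.5 + 0j, 0j], [0.3 + 0j, -2j, 0j])
    for m_s in (1.0, 0.1, 0.001):
        total, lon, tra = source_contraction(k_vec, m_s, *args)
        print(m_s, abs(total - (lon + tra)), abs(lon))
```

The longitudinal piece scales like $m_s^2$ and so fades away for a nearly real, nearly massless photon, while the transverse piece is unaffected.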

Note that the source components that appear are the ones that are being propagated so
that counting them gives us a lower bound on the number of dynamical fields $N_d$. In the
massive case we have $\sigma_i$ and $\sigma$, that is $N_d \geq d$, while in the massless limit the longitudinal part
$\sigma$ smoothly decouples and becomes unobservable and we are thus left with $N_d \geq d - 1$. Here
these inequalities are saturated, as we know. We will see however that this is not always the
case for non-local theories in the presence of ghosts.

Finally, as far as stability is concerned, we have that the saturated propagator (2.5.12) is
the one of a healthy scalar times a positive-definite scalar product of $j$ and $j'$, so that this
theory is stable.


2.5.2 Spin 2 

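Let us start by identifying the quadratic structure of (2.2.13)

$$ \mathcal{K}^{\mu\nu\rho\sigma} = \mathcal{E}^{\mu\nu\rho\sigma} - m^2 \left( \eta^{\mu(\rho} \eta^{\sigma)\nu} - (1 + \alpha) \eta^{\mu\nu} \eta^{\rho\sigma} \right) , \tag{2.5.13} $$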

18 Demanding heavier sources $m_s > m$ and no loops implies that the virtual photon can never be on-shell, i.e.
it is never a “real” photon. Alternatively, if $m_s < m$, then the process in which the photon is on-shell would be
kinematically allowed, in which case the propagator would be singular, implying an infinite probability for this
process to occur. As in the case of any heavy particle, the resolution of this apparent problem comes by noting
that the heavy particle becomes unstable precisely when $m_s < m$, since it can then disintegrate into the source’s
particles. By the optical theorem, we then have that the imaginary part of the vacuum polarization diagram
becomes non-zero. Since that diagram is responsible for shifting the mass $m$ under radiative corrections in the
propagator, we get that the poles of the renormalized propagator have a non-vanishing imaginary part. Thus,
the case $k^2 = -m_{\rm ren}^2$, where $m_{\rm ren}$ is the renormalized mass of the photon, is not a singularity of the renormalized
propagator but rather the maximum of the so-called “Breit-Wigner” resonance.











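where $\mathcal{E}$ was defined in (2.2.14). The propagator

$$ \mathcal{K}^{\mu\nu\alpha\beta} D_{\alpha\beta\rho\sigma} = i \delta^{(\mu}_\rho \delta^{\nu)}_\sigma , \tag{2.5.14} $$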


is given by 
pis pa
k 2 + m 2


r, (jlupVva + dpadup ) 1 ( 1 r
(2.5.15)


d d 

In the saturated propagator with conserved sources the terms with uncontracted $k_\mu$'s drop out


rp*piS p) rjifpa _ 

1 Up upa l - - k2 + m2 


rjn* rjpfpis I 2 


L pis 


d 


a k 2 + rrr 


d p 2 




k 2 + m 2 


rp* rp/piS rp* rpf 

^ ~d 


OL % * , 

rj~i% ji/ 


d 2 jjj 2 


(2.5.16) 


Note that this neatly splits into the Fierz-Pauli propagator ($\alpha = 0$) plus an extra scalar
propagator which can be written as




~ k2 + ™ 2 host d(d - 1) ’ 


(2.5.17) 


with $m^2_{\rm ghost}$ given precisely by (2.4.52). Indeed, this is the pole corresponding to the ghost since
the “kinetic” part $\sim k^2$ in the denominator comes with the wrong sign and the corresponding
source is the trace $T$. In the case $\alpha = -1/2$ we have that $m^2_{\rm ghost} = -m^2$, so that this also becomes
a tachyon, but then the tensor structure becomes the one of the massless theory


$$ T'^{*\mu\nu} D_{\mu\nu\rho\sigma} T^{\rho\sigma} = \frac{-i}{k^2 + m^2} \left[ T'^*_{\mu\nu} T^{\mu\nu} - \frac{1}{d - 1} T'^* T \right] . \tag{2.5.18} $$


Let us now focus on the case $\alpha = 0$. As we did for the spin-1 case, we can again perform
the harmonic decomposition of the sources (2.4.18) and use the conservation equations (2.4.36)
with “source mass” $m_s^2 = -k^2$


P = 


to get 


ik 2 
+ k 2
(2.5.19)

(2.5.20)
where x =



and M is a 2 x 2 matrix with eigenvalues 


$$ \lambda_- = 0 , \qquad \lambda_+ = \frac{(d^2 - 2d + 2) \, \vec{k}^4 + 2d \, \vec{k}^2 m_s^2 + d^2 m_s^4}{d (d - 1) \left( m_s^2 + \vec{k}^2 \right)^2} , \tag{2.5.21} $$


so that there is only one pole corresponding to the d-scalar sector and with the correct sign, as
expected. We see that the d-vector and d-tensor sectors are the exact analogues of the d-scalar
and d-vector sectors of electrodynamics (2.5.12). Considering the limits $m_s^2 \to m^2 \to 0$ we
get that the $\sigma_i$ part smoothly decouples and we are left with only $\sigma_{ij}$. In the d-scalar sector
however we have the vDVZ discontinuity since


$$ \lambda_+ \to \frac{d^2 - 2d + 2}{d (d - 1)} \neq 0 , \tag{2.5.22} $$


so this pole remains. We can compare this with the case m = 0 where, because of the gauge 
symmetry, we must add a gauge-fixing term in the action in order to invert the quadratic 
structure. The usual Lorentz-invariant choice is 


$$ S_{\rm gf} = -\frac{1}{\xi} \int d^D x \, \partial^\mu \bar{h}_{\mu\nu} \, \partial_\rho \bar{h}^{\rho\nu} , $$


where $\bar{h}_{\mu\nu}$ has been defined in (2.2.37), in which case one has

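$$ \mathcal{K}^{\mu\nu\rho\sigma} = \left[ \eta^{\mu(\rho} \eta^{\sigma)\nu} - \left( 1 - \frac{1}{2\xi} \right) \eta^{\mu\nu} \eta^{\rho\sigma} \right] \square - \left( 1 - \frac{1}{\xi} \right) \left( \eta^{\mu(\rho} \partial^{\sigma)} \partial^\nu + \eta^{\nu(\rho} \partial^{\sigma)} \partial^\mu - \eta^{\mu\nu} \partial^\rho \partial^\sigma - \eta^{\rho\sigma} \partial^\mu \partial^\nu \right) $$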


with inverse 

Dpvpcr 
k 2


2 (Wp.pVi'cr + Vp-crVi/p) ^ _ j. VpvVpcr
kpkp

m 2 


Comparing the saturated one 


rj~\* pi/ rj-ifper _ _ 


P v P a _|_ m 2 


rjp* rj~\f pi/ rj-\*rj~i/ 


^ pi /- 1 


d- 1 


" * / 
m 2 - m 2 ij ij 


(2.5.23) 


(2.5.24) 


(2.5.25) 


(2.5.26) 


with (2.5.16) for $\alpha = 0$, we see that the discontinuity lies in the factor in front of the $\sim T'^* T$
term which is $1/d$ instead of $1/(d - 1)$. This difference is precisely what is needed in order to
cancel the d-scalar pole. Finally, here too we can see that the $m \to 0$ and $\alpha \to 0$ limits do not
commute. Indeed, taking $m \to 0$ while keeping $\alpha \neq 0$ fixed we get that


a k 2 + m 2 d 

d ji 2 d — 1 ’ 


(2.5.27) 


so (2.5.16) becomes the massless propagator, which is independent of $\alpha$, and there is thus no
vDVZ discontinuity.
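
The non-commutativity of the two limits can be illustrated with exact rational arithmetic. The sketch below rests on an assumption that is not quoted from the text: the coefficient of the $T'^* T$ term in the saturated propagator is taken to be $R = [-\alpha k^2 + (1+\alpha) m^2] / [-(d-1)\alpha k^2 + (d + (d+1)\alpha) m^2]$, which is what one obtains by taking the divergence and the trace of the field equation $\mathcal{E}h - m^2 (h_{\mu\nu} - (1+\alpha)\eta_{\mu\nu} h) = -T_{\mu\nu}$ with a conserved source. The function name is hypothetical.

```python
from fractions import Fraction


def trace_coefficient(alpha, m2, k2, d):
    """Assumed coefficient R of the T'^* T term (see the lead-in for its origin)."""
    num = -alpha * k2 + (1 + alpha) * m2
    den = -(d - 1) * alpha * k2 + (d + (d + 1) * alpha) * m2
    return Fraction(num) / Fraction(den)


if __name__ == "__main__":
    d = 3
    k2 = Fraction(5)
    # alpha -> 0 first (Fierz-Pauli), any non-zero mass: R = 1/d
    print(trace_coefficient(Fraction(0), Fraction(1, 1000), k2, d))
    # m -> 0 first, alpha != 0 kept fixed: R = 1/(d - 1)
    print(trace_coefficient(Fraction(1, 1000), Fraction(0), k2, d))
    # alpha = -1/2: massless tensor structure for any mass
    print(trace_coefficient(Fraction(-1, 2), Fraction(2), k2, d))
```

Under this assumption the first limit gives $1/d$ and the second $1/(d-1)$, however small the parameter that is kept non-zero.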

2.6 Stückelberg formalism


In using massive theories so far we have encountered two conceptually disturbing features. First,
the gauge symmetry is broken and, second, the number of degrees of freedom is discontinuous
in the $m \to 0$ limit as it suddenly jumps from $2d$ to $2(d - 1)$. The Stückelberg trick [31,32,88]
is an elegant way of killing those two birds with one stone at the level of the action, and with
explicit Lorentz covariance. As in the case of propagators, it shows that the degrees of freedom
do not change discontinuously as $m \to 0$, but that some of them simply decouple.


2.6.1 Spin 1 

The so-called “Stückelberg trick” amounts to introducing auxiliary fields in a way which is
patterned on the gauge transformation itself. In the case of massive electrodynamics we have
(2.2.2) so one substitutes

$$ A_\mu \to A_\mu + \frac{1}{m} \partial_\mu \phi , \tag{2.6.1} $$

in (2.2.1), where $\phi$ is the “Stückelberg field”. Since this technically has the form of a gauge
transformation, only the mass term varies and we have that the Proca action becomes

$$ S[A] \to S[A, \phi] = \int d^D x \left[ -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2} m^2 A_\mu A^\mu - \frac{1}{2} \partial_\mu \phi \partial^\mu \phi - m A_\mu \partial^\mu \phi + A_\mu j^\mu \right] . \tag{2.6.2} $$

By construction, this action is invariant under the gauge transformation

$$ \delta A_\mu = -\partial_\mu \theta , \qquad \delta\phi = m \theta , \tag{2.6.3} $$

so $\phi$ is a redundant (pure-gauge) field. The equations of motion of $A_\mu$ and $\phi$ are, respectively,

$$ \partial_\mu F^{\mu\nu} - m^2 A^\nu = -j^\nu + m \partial^\nu \phi , \qquad \square\phi = -m \partial_\mu A^\mu , \tag{2.6.4} $$

and we see that the latter is nothing but the divergence of the former. The gauge in which
$\phi = 0$ is called the “unitary gauge”, in which case one recovers the equations of Proca theory.
However, the advantage of having $\phi$ around is that we can keep imposing a gauge condition on the
gauge field and, by choosing this condition appropriately, $\phi$ can then be interpreted as carrying
the information of the longitudinal degrees of freedom that are activated in the massive theory.
To see this, let us proceed to two different gauge-fixing scenarios.

We first choose to impose the Lorentz gauge $\partial_\mu A^\mu = 0$ so that, along with the equation of
motion of $A_0$, we can fix the initial conditions of the latter

$$ (\Delta - m^2) A_0 = \partial_i \dot{A}_i + m \dot{\phi} - j_0 , \qquad \dot{A}_0 = \partial_i A_i . \tag{2.6.5} $$

We are then left with the equations

$$ (\square - m^2) A_i = -j_i + m \partial_i \phi , \qquad \square\phi = 0 . \tag{2.6.6} $$

Now we see that, as in the massless case in (2.2.1), we also have a residual gauge symmetry
given by the $\theta$ obeying $\square\theta = 0$. However, since the $A_i$ obey a massive Klein-Gordon equation,
we cannot use such a $\theta$ to kill the homogeneous solution of $\partial_i A_i$ as in the massless case. Rather,
we can use $\theta$ to set $\phi = 0$, so that this amounts to choosing the unitary gauge. Thus, with
the Lorentz gauge the Stückelberg field cannot represent the longitudinal mode since it obeys
a massless Klein-Gordon equation.

Another initial choice of gauge is $\partial_\mu A^\mu = -m\phi$, in which case the conditions on $A_0$ read

$$ (\Delta - m^2) A_0 = \partial_i \dot{A}_i + m \dot{\phi} - j_0 , \qquad \dot{A}_0 = \partial_i A_i + m\phi , \tag{2.6.7} $$

and the leftover equations are

$$ (\square - m^2) A_i = -j_i , \qquad (\square - m^2) \phi = 0 . \tag{2.6.8} $$

We have again a residual gauge symmetry but it is now parametrized by the $\theta$ obeying
$(\square - m^2) \theta = 0$. We can thus choose either to set $\phi = 0$ using such a $\theta$, or to eliminate the
homogeneous solution of $\partial_i A_i$ as in (2.2.1). In the latter case, it is therefore $\phi$ which survives
and represents the degrees of freedom associated with the longitudinal part $\partial_i A_i$, while $A_\mu$
contains $2(d - 1)$ degrees of freedom as in the massless case. Thus, the interpretation of $\phi$
depends on the choice of gauge one makes.

Nevertheless, the interpretation in which $\phi$ represents the 2 degrees of freedom of the
longitudinal part is the most appealing because it survives in the $m \to 0$ limit. Indeed, for
$m = 0$ we have that $\phi$ becomes gauge-invariant and thus an unambiguous degree of freedom.
We are then left with massless electrodynamics plus a scalar, totaling $N_f = 2N_d = 2d$. The
important feature is that $A_\mu$ and $\phi$ are now decoupled, so if we focus on the dynamics of $A_\mu$
then $\phi$ is unobservable. Just as we saw when studying the propagators, the longitudinal modes
do not propagate in the $A_\mu$ field anymore.


2.6.2 Spin 2 

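In the spin-2 case we must pattern the introduction of the Stückelberg field on (2.2.15)

$$ h_{\mu\nu} \to h_{\mu\nu} + \frac{1}{m} \left( \partial_\mu A_\nu + \partial_\nu A_\mu \right) , \tag{2.6.9} $$

in (2.2.13) to get

$$ S = \int d^D x \Big[ \frac{1}{2} h_{\mu\nu} \mathcal{E}^{\mu\nu\rho\sigma} h_{\rho\sigma} - \frac{1}{2} F_{\mu\nu} F^{\mu\nu} + 2\alpha \left( \partial_\mu A^\mu \right)^2 - \frac{1}{2} m^2 \left( h_{\mu\nu} h^{\mu\nu} - (1 + \alpha) h^2 \right) $$
$$ \qquad - 2m \left( h^{\mu\nu} \partial_\mu A_\nu - (1 + \alpha) h \, \partial_\mu A^\mu \right) + h_{\mu\nu} T^{\mu\nu} \Big] , \tag{2.6.10} $$

where as usual $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, so that the gauge symmetry is restored

$$ \delta h_{\mu\nu} = -\partial_\mu \xi_\nu - \partial_\nu \xi_\mu , \qquad \delta A_\mu = m \xi_\mu . \tag{2.6.11} $$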


Note that for $\alpha = 0$ the equation of motion of $A_\mu$ takes the form of the equation of massless
electrodynamics with an $h_{\mu\nu}$-dependent source, so it is invariant under the U(1) transformation
(2.2.2). This means that $A_\mu$ represents $2(d - 1)$ degrees of freedom, while the difference between
Fierz-Pauli theory and the massless theory is $D^2 - D - 2 - (d^2 - d - 2) = 2d$, so if we take the
$m \to 0$ limit now we are still discontinuous in the number of degrees of freedom. We can thus
perform a second Stückelberg trick on this field in order to acquire the U(1) symmetry as well.
We replace

$$ A_\mu \to A_\mu + \frac{1}{m} \partial_\mu \phi , \tag{2.6.12} $$
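to get

$$ S = \int d^D x \Big[ \frac{1}{2} h_{\mu\nu} \mathcal{E}^{\mu\nu\rho\sigma} h_{\rho\sigma} - \frac{1}{2} F_{\mu\nu} F^{\mu\nu} + 2\alpha \left( \partial_\mu A^\mu \right)^2 + \frac{2\alpha}{m^2} \left( \square\phi \right)^2 $$
$$ \qquad - \frac{1}{2} m^2 \left( h_{\mu\nu} h^{\mu\nu} - (1 + \alpha) h^2 \right) - 2m \left( h^{\mu\nu} \partial_\mu A_\nu - (1 + \alpha) h \, \partial_\mu A^\mu \right) $$
$$ \qquad - 2 \left( h^{\mu\nu} \partial_\mu \partial_\nu \phi - (1 + \alpha) h \, \square\phi \right) + \frac{4\alpha}{m} \partial_\mu A^\mu \, \square\phi + h_{\mu\nu} T^{\mu\nu} \Big] , \tag{2.6.13} $$

which has the gauge symmetry

$$ \delta h_{\mu\nu} = 0 , \qquad \delta A_\mu = -\partial_\mu \theta , \qquad \delta\phi = m \theta . \tag{2.6.14} $$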


For $\alpha \neq 0$ we see that we have a higher-derivative theory for $\phi$, which means that it carries a
healthy and a ghost-like degree of freedom. Indeed, one can integrate in a second scalar $\psi$ to
lower the derivative order by replacing$^{19}$

$$ \frac{2\alpha}{m^2} \left( \square\phi \right)^2 \to -2\alpha \left( \partial_\mu \psi \, \partial^\mu \phi + \frac{1}{4} m^2 \psi^2 \right) , \tag{2.6.15} $$

and then diagonalize the $\phi, \psi$ kinetic sector to find there is a ghost$^{20}$.

As for the limit $m \to 0$, the cases $\alpha = 0$ and $\alpha \neq 0$ must be considered separately as always.
In the former case we have that $A_\mu$ decouples, while we still have terms $\sim \partial h \, \partial\phi$. We must
thus diagonalize the $h_{\mu\nu}$ and $\phi$ kinetic sectors by redefining

$$ h'_{\mu\nu} = h_{\mu\nu} - \frac{2}{d - 1} \eta_{\mu\nu} \phi , \tag{2.6.16} $$


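to get

$$ S = \int d^D x \left[ \frac{1}{2} h'_{\mu\nu} \mathcal{E}^{\mu\nu\rho\sigma} h'_{\rho\sigma} - \frac{1}{2} F_{\mu\nu} F^{\mu\nu} - \frac{2d}{d - 1} \partial_\mu \phi \partial^\mu \phi + h'_{\mu\nu} T^{\mu\nu} + \frac{2}{d - 1} \phi \, T \right] . $$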

We see that although $A_\mu$ has totally decoupled, the scalar $\phi$ remains coupled to the source and
is gauge-invariant under (2.6.14). Thus, $\phi$ still interacts with the system and this is the way
the vDVZ discontinuity manifests itself in this formalism. For the $\alpha \neq 0$ case, there is no U(1)
gauge symmetry in the equation of $A_\mu$ to begin with, so the latter already represents the $2d$
degrees of freedom that are activated by the mass. We therefore do not need to introduce the
Stückelberg scalar and can take the $m \to 0$ limit at the level of the $h_{\mu\nu}, A_\mu$ action (2.6.10), to
get that $A_\mu$ decouples, leaving us with the massless theory for $h_{\mu\nu}$.



2.7 Non-local formulation 

Another advantage of the Stückelberg formalism is that it can serve as an intuitive starting
point for constructing non-local gauge theories. Here we follow closely the procedure introduced
in [29,69] and also used in our paper [68].

19 The original action is then obtained by integrating out $\psi$.

20 This is why any other kinetic term than $F_{\mu\nu} F^{\mu\nu}$ for a vector field implies a ghost, by the way.

2.7.1 Spin 1

Let us start by solving in a causal way the equation of motion of the Stückelberg field $\phi$ (2.6.4)

$$ \phi = \phi^{\rm hom} - m \square^{-1} \partial_\mu A^\mu , \tag{2.7.1} $$

where $\phi^{\rm hom}$ is a homogeneous solution, $\square\phi^{\rm hom} = 0$, and $\partial_\mu A^\mu$ must have finite past for this
equation to make sense. For notational simplicity, unless specified otherwise, from now on we
will only write “$\square^{-1}$” to denote the retarded inversion of $\square$.

Since we know that $\partial_\mu A^\mu$ is not physical, demanding that it has finite past is not too much
of a restriction. It would have been far more dramatic if we imposed this condition on all
of $A_\mu$, because this would exclude free wave-packet solutions since these extend arbitrarily far
into the past. We can now proceed and plug (2.7.1) inside the equation for $A_\mu$ to get

$$ \partial_\mu F^{\mu\nu} - m^2 \mathcal{P}^\nu{}_\mu A^\mu = -j'^\nu , \tag{2.7.2} $$

where we have a new conserved source

$$ j'^\mu = j^\mu - m \partial^\mu \phi^{\rm hom} , \qquad \partial_\mu j'^\mu = 0 , \tag{2.7.3} $$

and we have defined the operator

$$ \mathcal{P}_\mu{}^\nu \equiv \delta_\mu^\nu - \partial_\mu \square^{-1} \partial^\nu , \tag{2.7.4} $$

which has the following nice properties. It is a projector

$$ \mathcal{P}_\mu{}^\rho \mathcal{P}_\rho{}^\nu = \delta_\mu^\nu - 2 \partial_\mu \square^{-1} \partial^\nu + \partial_\mu \square^{-1} \square \square^{-1} \partial^\nu = \mathcal{P}_\mu{}^\nu , \tag{2.7.5} $$

where we have used the fact that $\square^{-1}$ is a right inverse of $\square$, the projected field $A^T_\mu \equiv \mathcal{P}_\mu{}^\nu A_\nu$
is $D$-transverse$^{21}$

$$ \partial^\mu A^T_\mu = \partial^\mu A_\mu - \square \square^{-1} \partial^\nu A_\nu = 0 , \tag{2.7.6} $$

and, under a gauge transformation (2.2.2) where the gauge parameter $\theta$ has finite past, varies
as

$$ \delta A^T_\mu = -\partial_\mu \theta + \partial_\mu \square^{-1} \square \theta = 0 . \tag{2.7.7} $$

Indeed, since $\square^{-1}$ acts on $\square\theta$, it only makes sense for $\square\theta$ with finite past, which implies that $\theta$
has finite past and also that $\square^{-1} \square = {\rm id}$. This condition on the gauge parameter is reminiscent
of the condition we encountered on the initial conditions of the gauge parameter on de Sitter
space-time. Again, this does not exclude the possibility of using $\theta$ to neutralize a field mode, so
it does not diminish the gauge symmetry in any sense. We thus have that $A^T_\mu$ is gauge-invariant
for all practical purposes.
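
The three properties above have a simple momentum-space counterpart that can be checked numerically. This is a sketch only: it assumes a mostly-plus metric, $D = 4$, an off-shell momentum with $k^2 \neq 0$, and it ignores the finite-past subtleties by replacing $\square^{-1}$ with $-1/k^2$. The function names are illustrative.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # mostly-plus metric, D = 4 (illustrative)
DIM = len(ETA)


def projector(k_up):
    """P_mu^nu(k) = delta_mu^nu - k_mu k^nu / k^2, stored as P[mu][nu]."""
    k_dn = [ETA[a] * k_up[a] for a in range(DIM)]
    k2 = sum(k_up[a] * k_dn[a] for a in range(DIM))
    return [[(1.0 if a == b else 0.0) - k_dn[a] * k_up[b] / k2
             for b in range(DIM)] for a in range(DIM)]


def projector_checks(k_up, a_dn):
    """Return the violations of (2.7.5), (2.7.6) and (2.7.7) in Fourier space."""
    p = projector(k_up)
    k_dn = [ETA[a] * k_up[a] for a in range(DIM)]
    # (2.7.5): P P = P
    idempotency = max(
        abs(sum(p[a][r] * p[r][b] for r in range(DIM)) - p[a][b])
        for a in range(DIM) for b in range(DIM))
    # (2.7.6): k^mu A^T_mu = 0 with A^T_mu = P_mu^nu A_nu
    a_t = [sum(p[a][b] * a_dn[b] for b in range(DIM)) for a in range(DIM)]
    divergence = abs(sum(k_up[a] * a_t[a] for a in range(DIM)))
    # (2.7.7): a pure-gauge mode A_nu ~ k_nu theta is annihilated by P
    pure_gauge = max(
        abs(sum(p[a][b] * k_dn[b] for b in range(DIM))) for a in range(DIM))
    return idempotency, divergence, pure_gauge


if __name__ == "__main__":
    print(projector_checks([0.3, 1.1, -0.7, 0.4], [1.0, -2.0, 0.5, 0.25]))
```

All three numbers should vanish up to rounding for any field configuration `a_dn`.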

Going back to (2.7.2) we see that we have reached a gauge-invariant description of massive
electrodynamics with no extra field, but at the price of non-locality. This may a priori sound
a bit surprising because we know that this non-local theory is equivalent to a local one. This
means that the physics of (2.7.2) cannot be non-local, i.e. the prediction of the value of some
physical observable at $x$ should still only depend on the data in its infinitesimal past light-cone
neighbourhood. This is indeed the case because by going to the Lorentz gauge $\partial_\mu A^\mu = 0$ the
equations become local. Thus, non-locality is only an artefact of explicit gauge-invariance and
actually affects only the pure-gauge modes. The mass term can therefore be understood as the
obstruction to having simultaneously both manifest locality and gauge-invariance.

21 This is not a surprise since the right-hand side of (2.7.2) is transverse.

Where are the degrees of freedom?

Let us now try to count the degrees of freedom using (2.7.2). We choose the Lorentz gauge
$\partial_\mu A^\mu = 0$ so that we retrieve the equation of motion of Proca theory (2.2.5), but with $j'^\mu$
instead of $j^\mu$, i.e. we have the homogeneous solution of $\phi$ that is still around. This amounts
to as many different sources as $\phi$ has initial data, so we might be worried that our non-local
trick might have inserted additional degrees of freedom into the system. Of course there is no
miracle, and $\phi^{\rm hom}$ is eliminated by the residual gauge symmetry one has in the Stückelberg
formalism. Indeed, the equations being

$$ (\square - m^2) A_\mu = -j_\mu + m \partial_\mu \phi^{\rm hom} , \qquad \partial_\mu A^\mu = 0 , \tag{2.7.8} $$

we can transform with $\theta$ such that $\square\theta = 0$ to get

$$ (\square - m^2) A_\mu + m^2 \partial_\mu \theta = -j_\mu + m \partial_\mu \phi^{\rm hom} , \qquad \partial_\mu A^\mu = 0 . \tag{2.7.9} $$

Since $\square\phi^{\rm hom} = 0$ as well, we can choose $\theta = m^{-1} \phi^{\rm hom}$ and retrieve Proca theory exactly.
Indeed, remember from section 2.6.1 that in the $\partial_\mu A^\mu = 0$ gauge, $\phi$ cannot represent the
longitudinal mode because it is massless, $\square\phi = 0$, so fully gauge-fixing can only result in the
unitary gauge $\phi = 0$. This shows us that we could have avoided keeping track of $\phi^{\rm hom}$ in the
above computations since at the end of the day this “freedom” is pure-gauge. In the Stückelberg
formalism if we set $\partial_\mu A^\mu = 0$, then we still have a residual gauge symmetry. In the non-local
formalism with $\phi^{\rm hom} = 0$, if we set $\partial_\mu A^\mu = 0$ we have the Proca equations and thus no residual
gauge symmetry.

Nevertheless, we also saw in section 2.6.1 that if we rather choose the gauge $\partial_\mu A^\mu = -m\phi$,
then $\phi$ obeys $(\square - m^2) \phi = 0$, so its homogeneous solution could be interpreted as carrying
the longitudinal degrees of freedom of the theory. However, here if $\phi$ were to carry the plane
wave solutions of the longitudinal mode, then the gauge choice $\partial_\mu A^\mu = -m\phi$ would not be
admissible because $\partial_\mu A^\mu$ would not have finite past.

We therefore conclude that the Stückelberg fields cannot represent the mode that is activated
by the mass in this non-local formulation and thus one can safely set $\phi^{\rm hom} = 0$. From
now on $j'^\mu = j^\mu$ and we will also neglect the homogeneous solutions when integrating out the
Stückelberg fields in the spin-2 case. Indeed, there too the homogeneous solutions of the
Stückelberg fields will be massless so that they cannot represent the dynamical fields of the theory.
They ultimately correspond to the residual gauge freedom of the Stückelberg formalism.

Filtered response to linear sources 

The non-local equation of motion (2.7.2), although quite elegant, can be simplified even more
if we restrict to the case where all of $A_\mu$ has finite past and thus so does $\partial_\mu F^{\mu\nu}$. This is the case
where one is interested in the production of electromagnetic waves by a source with finite past,
i.e. when any radiation at future infinity is entirely due to $j^\mu$. Then, one can write

$$ A^T_\mu = \square^{-1} \partial^\nu F_{\nu\mu} , \tag{2.7.10} $$

so that (2.7.2) reads

$$ \left( 1 - \frac{m^2}{\square} \right) \partial_\mu F^{\mu\nu} = -j^\nu . \tag{2.7.11} $$
In this particular case, we have access to a new interpretation of the mass term as a high-pass
filter [29,31,32,69]. Indeed, going to “Fourier space”$^{22}$ and neglecting the pole contour
prescription, we have

$$ -\left( 1 + \frac{m^2}{k^2} \right) i k^\mu F_{\mu\nu} = -j_\nu , \tag{2.7.12} $$

which can be inverted to give

$$ i k^\mu F_{\mu\nu} = \frac{k^2}{m^2 + k^2} \, j_\nu . \tag{2.7.13} $$

Now the left-hand side is the kinetic term of ordinary massless electrodynamics, but the source
is multiplied by a filter which modulates its intensity. Indeed, for $k^2 \ll m^2$, i.e. for low
frequencies and large wavelengths, the source of $A_\mu(k)$ is suppressed as $\sim k^2/m^2$. This is the degravitation
analogue for electrodynamics, which “screens” the large scale behaviour of the source [29,69].
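
The filtering behaviour of the factor $k^2/(m^2 + k^2)$ in (2.7.13) is easy to tabulate. The following is a small sketch, restricted for simplicity to $k^2 > 0$ so that the pole at $k^2 = -m^2$ is not approached; the function name is illustrative.

```python
def source_filter(k2, m2):
    """Factor multiplying the source in (2.7.13): k^2 / (m^2 + k^2)."""
    return k2 / (k2 + m2)


if __name__ == "__main__":
    m2 = 1.0
    for k2 in (1e-4, 1e-2, 1.0, 1e2, 1e4):
        # Small k^2: the factor behaves like k^2 / m^2 (source screened).
        # Large k^2: the factor tends to 1 (massless electrodynamics).
        print(f"k^2/m^2 = {k2 / m2:10.4g}   filter = {source_filter(k2, m2):.6f}")
```

Modes with $k^2 \ll m^2$ see a source reduced by roughly $k^2/m^2$, while modes with $k^2 \gg m^2$ see it essentially unchanged, which is the sense in which the mass acts as a high-pass filter.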

It is important to stress one more time that equation (2.7.11) is valid only when studying
the response to an external source. More precisely, (2.7.11) only makes sense if $\partial_\mu F^{\mu\nu}$ has
finite past, which excludes ingoing radiation at past infinity since that radiation does not obey
$\partial_\mu F^{\mu\nu} = 0$ because of the mass. Therefore, (2.7.11) cannot be taken as a classical model
covering every feature of massive electrodynamics. For a full description of the theory, with
the constraint of past infinity applying only on non-dynamical fields (here $\partial_\mu A^\mu$), one needs to
consider (2.7.2).

Propagators using projectors 

The computation of the propagator in a massive but yet gauge-invariant setting is very
instructive, especially in the light of this projector formalism. We can first rewrite (2.7.11) as

$$ (\square - m^2) \mathcal{P}_\mu{}^\nu A_\nu = -j_\mu , \tag{2.7.14} $$

so that the operator which must be inverted is

$$ \mathcal{K}_\mu{}^\nu = (\square - m^2) \mathcal{P}_\mu{}^\nu . \tag{2.7.15} $$

As in the massless case, the gauge invariance of the equation is reflected in the fact that $\mathcal{K}$
is proportional to a projector. It gives zero on pure-gauge modes, which means that it has a non-trivial
kernel and is therefore not uniquely invertible. In section 2.5 we have used the standard
method for inverting such operators, which is to introduce a gauge-fixing term that will not
affect the saturated propagator. In the spirit of the projector formalism developed here, there
is actually a natural way of privileging an inverse that is also easily computable. Indeed, we
can note that the space in which $\mathcal{K}$ lives is the space of transverse operators and that $\mathcal{P}_\mu{}^\nu$
is the identity element. Thus, as long as we restrict to this subspace, the inversion relation
becomes

$$ \mathcal{K}_\mu{}^\rho D_{\rho\nu} = i \mathcal{P}_{\mu\nu} , \tag{2.7.16} $$

and admits a unique transverse inverse (up to the homogeneous solution/initial conditions
ambiguity)

$$ D_{\mu\nu} = \frac{-i}{k^2 + m^2} \left( \eta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2} \right) . \tag{2.7.17} $$

22 This is actually not really possible for the time coordinate since $A_\mu$ will in general not vanish at future
infinity because of the waves generated by the source. One should rather use a Laplace transform for $t$ since the
support of $A_\mu$ is bounded in the past.
Not surprisingly, in the massless case this corresponds to the Landau gauge $\xi = 0$ in (2.5.8).
This is the only choice that cannot be expressed through a gauge-fixing term (2.5.6) precisely
because it is the only choice which imposes transversality $\partial_\mu A^\mu = 0$, instead of breaking it. In
any case, as already noted, since the source is conserved the physically relevant term is the one
with no uncontracted $k_\mu$'s. In the spin-2 case however, there will be a whole one-parameter
family of transverse operators, so this construction will be very useful.


2.7.2 Spin 2 

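The equations of motion of (2.6.13) are

$$ \mathcal{E}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} - m^2 \left( h_{\mu\nu} - (1 + \alpha) \eta_{\mu\nu} h \right) = -T_{\mu\nu} + 2m \left( \partial_{(\mu} A_{\nu)} - (1 + \alpha) \eta_{\mu\nu} \partial_\rho A^\rho \right) + 2 \left( \partial_\mu \partial_\nu \phi - (1 + \alpha) \eta_{\mu\nu} \square\phi \right) , \tag{2.7.18} $$

$$ \partial_\mu F^{\mu\nu} - 2\alpha \, \partial^\nu \partial_\mu A^\mu = -m j^\nu + \frac{2\alpha}{m} \partial^\nu \square\phi , \tag{2.7.19} $$

$$ \alpha \square^2 \phi = \frac{m^2}{2} \partial_\mu j^\mu - \alpha m \, \square \partial_\mu A^\mu , \tag{2.7.20} $$

for $h_{\mu\nu}$, $A_\mu$ and $\phi$, respectively, and we find it convenient to define the quantity

$$ j_\nu \equiv \partial^\mu h_{\mu\nu} - (1 + \alpha) \partial_\nu h . \tag{2.7.21} $$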

Again, note that each one of these equations is the divergence of the previous one. For $\alpha \neq 0$,
we can solve for $\phi$

$$ \phi = \frac{m^2}{2\alpha} \square^{-2} \partial_\mu j^\mu - m \square^{-1} \partial_\mu A^\mu , \tag{2.7.22} $$

where, as anticipated in the spin-1 case, the homogeneous solution $\square^2 \phi^{\rm hom} = 0$ can be safely set
to zero since it cannot represent a massive mode and is thus ultimately pure-(residual)gauge.
Remember that this expression for $\phi$ makes sense only if $\partial_\mu j^\mu$ and $\partial_\mu A^\mu$ have finite past.
Plugging this inside the equation of $A_\mu$ we get

$$ \partial_\mu F^{\mu\nu} = -m \mathcal{P}^\nu{}_\mu j^\mu , \tag{2.7.23} $$

where every term is independently transverse. Now this equation is gauge-invariant so we must
fix the gauge in order to solve it. We choose $\partial_\mu A^\mu = 0$, invert $\square$ and then add a pure-gauge
term to get the general solution. This gives, setting again to zero any homogeneous solution,

$$ A_\mu = -m \square^{-1} \mathcal{P}_\mu{}^\nu j_\nu + \partial_\mu \theta . \tag{2.7.24} $$

To perform this inversion we now also need $j_\mu$ to have finite past, not just its divergence. This
is again fine because $j_\mu$ does not represent dynamical fields since it is actually zero in the original
formulation (2.2.17). Plugging the solution of $A_\mu$ in the one of $\phi$ we get

$$ \phi = \frac{m^2}{2\alpha} \square^{-2} \partial_\mu j^\mu - m \theta , \tag{2.7.25} $$

where we have used the fact that $\theta$ has finite past since $\partial_\mu A^\mu = \square\theta$ has finite past. Now that
both $A_\mu$ and $\phi$ are expressed in terms of $h_{\mu\nu}$ we can plug them in the equation of motion of
the latter to get

$$ \mathcal{E}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} - m^2 \, {}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T_{\mu\nu} , \tag{2.7.26} $$
where


-p P a 

* ' LiU 


&1/Z) - (1 + a) r]^V pa - 

2d^U~ l d v) - 


1 + a n —1 


+(1 + a) r) pa 
1 + 2 a 


a 

1 + 2a 
a 


□ - 1 n) - fydvp-'dr + Sldvp-'d?) 

1 H - (X 


d^duU 


-i 


1 9 p 5, 


a 


+ - 


a 


d^duU-^dP 


(2.7.27) 


Although we have expressed this such that $\square^{-1}$ acts separately on $\square h$ and $\partial_\rho \partial_\sigma h^{\rho\sigma}$, this
requires only that $j_\mu$ has finite past to converge. Given the complexity of this structure, here
we will directly focus on the case where all of $h_{\mu\nu}$, and thus $T_{\mu\nu}$, has finite past, so that we can
commute all these operators at will. The result is then very elegant since it can be expressed
in terms of the vector projectors

$$ {}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} = \mathcal{P}_{(\mu}{}^\rho \mathcal{P}_{\nu)}{}^\sigma + \frac{1 + \alpha}{\alpha} \mathcal{P}_{\mu\nu} \mathcal{P}^{\rho\sigma} . \tag{2.7.28} $$


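As anticipated earlier, here we have that ${}_\alpha\mathcal{P}$ is a one-parameter family of operators making the
tensor on which they act transverse

$$ \partial^\mu \, {}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = 0 , \tag{2.7.29} $$

and also gauge-invariant under (2.2.15) for $\xi_\mu$ with finite past. It is convenient to switch to
another parametrization, namely

$$ a = 1 + d \left( 1 + 1/\alpha \right) , \qquad \alpha = \frac{d}{a - d - 1} , \tag{2.7.30} $$

and define

$$ {}_a\mathcal{P}_{\mu\nu}{}^{\rho\sigma} = {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} + a \, {}_s\mathcal{P}_{\mu\nu}{}^{\rho\sigma} \tag{2.7.31} $$

$$ {} = \delta_{(\mu}^\rho \delta_{\nu)}^\sigma - \frac{1 - a}{d} \eta_{\mu\nu} \eta^{\rho\sigma} - \frac{2}{\square} \delta_{(\mu}^{(\rho} \partial_{\nu)} \partial^{\sigma)} + \frac{1 - a}{d \, \square} \left( \eta_{\mu\nu} \partial^\rho \partial^\sigma + \eta^{\rho\sigma} \partial_\mu \partial_\nu \right) + \left( 1 - \frac{1 - a}{d} \right) \frac{1}{\square^2} \partial_\mu \partial_\nu \partial^\rho \partial^\sigma , \tag{2.7.32} $$

where

$$ {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} = \mathcal{P}_{(\mu}{}^\rho \mathcal{P}_{\nu)}{}^\sigma - \frac{1}{d} \mathcal{P}_{\mu\nu} \mathcal{P}^{\rho\sigma} , \qquad {}_s\mathcal{P}_{\mu\nu}{}^{\rho\sigma} = \frac{1}{d} \mathcal{P}_{\mu\nu} \mathcal{P}^{\rho\sigma} . \tag{2.7.33} $$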

To avoid confusing ${}_s\mathcal{P}$ with ${}_a\mathcal{P}$ where $a = s$, let us stress that the letter “s” will be used exclusively to denote the second operator in (2.7.33). Now observe that ${}_0\mathcal{P}$ and ${}_s\mathcal{P}$ are orthogonal projectors

$${}_0\mathcal{P}^2 = {}_0\mathcal{P} , \qquad {}_s\mathcal{P}^2 = {}_s\mathcal{P} , \qquad {}_0\mathcal{P}\, {}_s\mathcal{P} = 0 ,$$ (2.7.34)

on the subspaces of transverse-traceless and transverse-pure-trace tensors, respectively. Indeed, ${}_0\mathcal{P}^{\mu}{}_{\mu}{}^{\rho\sigma} h_{\rho\sigma} = 0$, so ${}_0\mathcal{P}$ is also invariant under linearized local conformal transformations

$$\delta h_{\mu\nu} = \eta_{\mu\nu}\, \theta ,$$ (2.7.35)

for $\theta$ with finite past. The obvious advantage of the $a$ parametrization is that the linear combination and product of two such operators now follow the simple rules

$$\alpha\, {}_a\mathcal{P} + \beta\, {}_b\mathcal{P} = (\alpha + \beta)\, {}_{\frac{\alpha a + \beta b}{\alpha + \beta}}\mathcal{P} , \qquad {}_a\mathcal{P} - {}_b\mathcal{P} = (a - b)\, {}_s\mathcal{P} , \qquad {}_a\mathcal{P}\, {}_b\mathcal{P} = {}_{ab}\mathcal{P} ,$$ (2.7.36)

so ${}_a\mathcal{P}$ is not a projector unless $a = 0$ or $1$. In the latter case, we have the projector on the subspace of transverse tensors, ${}_1\mathcal{P} = {}_0\mathcal{P} + {}_s\mathcal{P}$. Thus, ${}_1\mathcal{P}$, ${}_0\mathcal{P}$ and ${}_s\mathcal{P}$ are the identity elements of the spaces on which they project.
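The rules (2.7.36) can be checked numerically in momentum space, where $\Box^{-1}\partial_\mu\partial_\nu$ becomes $k_\mu k_\nu / k^2$. The following is a minimal sketch, assuming $D = 4$, a mostly-plus Minkowski metric and an arbitrary non-null momentum; the helper names `tensor_P` and `compose` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import product

D = 4          # spacetime dimension (assumed for this example)
d = D - 1      # number of spatial dimensions
R = range(D)

# Mostly-plus Minkowski metric; it is its own inverse.
eta = [[Fraction(0)] * D for _ in R]
eta[0][0] = Fraction(-1)
for i in range(1, D):
    eta[i][i] = Fraction(1)

k_up = [Fraction(2), Fraction(1), Fraction(-3), Fraction(5)]   # arbitrary k^mu
k_dn = [sum(eta[m][n] * k_up[n] for n in R) for m in R]        # k_mu
k2 = sum(k_dn[m] * k_up[m] for m in R)                         # k^2, non-zero here

# Vector projector in the three index placements that are needed.
P_mix = [[Fraction(int(m == r)) - k_dn[m] * k_up[r] / k2 for r in R] for m in R]  # P_m^r
P_dn = [[eta[m][n] - k_dn[m] * k_dn[n] / k2 for n in R] for m in R]               # P_mn
P_up = [[eta[r][s] - k_up[r] * k_up[s] / k2 for s in R] for r in R]               # P^rs


def tensor_P(a):
    """aP_{mn}^{rs} = P_(m^r P_n)^s + (a - 1)/d P_mn P^rs, as a dict keyed by (m, n, r, s)."""
    a = Fraction(a)
    return {
        (m, n, r, s): (P_mix[m][r] * P_mix[n][s] + P_mix[m][s] * P_mix[n][r]) / 2
        + (a - 1) / d * P_dn[m][n] * P_up[r][s]
        for m, n, r, s in product(R, repeat=4)
    }


def compose(A, B):
    """(A B)_{mn}^{rs} = sum over p, q of A_{mn}^{pq} B_{pq}^{rs}."""
    return {
        (m, n, r, s): sum(A[m, n, p, q] * B[p, q, r, s] for p, q in product(R, repeat=2))
        for m, n, r, s in product(R, repeat=4)
    }


a, b = Fraction(2, 3), Fraction(-5)
alpha, beta = Fraction(3), Fraction(4)
Pa, Pb = tensor_P(a), tensor_P(b)

# Product rule: aP bP = (ab)P.
product_rule_ok = compose(Pa, Pb) == tensor_P(a * b)

# Linear combination rule.
Pc = tensor_P((alpha * a + beta * b) / (alpha + beta))
sum_rule_ok = all(alpha * Pa[key] + beta * Pb[key] == (alpha + beta) * Pc[key] for key in Pa)

# Transversality: k^m aP_{mn}^{rs} = 0.
transverse_ok = all(
    sum(k_up[m] * Pa[m, n, r, s] for m in R) == 0
    for n, r, s in product(R, repeat=3)
)
print(product_rule_ok, sum_rule_ok, transverse_ok)
```

Exact rational arithmetic is used so that the identities are tested as equalities rather than up to rounding.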

In terms of $\alpha$, the choice $a = 0$ corresponds to $\alpha = -d/(d + 1)$, which is the value for which the mass of the ghost (2.4.52) vanishes. Indeed, since the ghost is the trace $h$, it is consistent that the mass term in that case is traceless. Interestingly enough, the projector $a = 1$ corresponds to the value $\alpha = -1$. From now on, every time we assign a numerical value to the argument of $\mathcal{P}$ it will be with respect to the “a” parametrization (2.7.32).

Now note that the Lichnerowicz operator (2.2.14) takes the form $\mathcal{E} = \Box\, {}_{1-d}\mathcal{P}$, which corresponds to $\alpha = -1/2$. Indeed, this is the only $\mathcal{P}$ that has no $\sim \Box^{-2}$ term, so it is the only case where $\Box \mathcal{P}$ is a local second-order transverse operator. Therefore, in the case $\alpha = -1/2$, we can rewrite the equation in a compact form analogous to (2.7.11)

$$(\Box - m^2)\, {}_{1-d}\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T_{\mu\nu} , \qquad \alpha = -\frac{1}{2} ,$$ (2.7.37)

which is the result found in [29,68,69] (see footnote 23). Not surprisingly, for this value of $\alpha$ we also have that, according to (2.4.52),

$$m^2_{\rm ghost} = -m^2 ,$$ (2.7.38)

so that the ghost mode is also a tachyon with the same magnitude of mass as the spin-2 modes. To understand why this happens, note that the differential operator corresponding to this equation is

$$\mathcal{K}_{\mu\nu\rho\sigma} = (\Box - m^2)\, {}_{1-d}\mathcal{P}_{\mu\nu\rho\sigma} .$$ (2.7.39)

Since it is transverse but not traceless, the appropriate identity for the inversion is

$$\mathcal{K}_{\mu\nu}{}^{\alpha\beta} D_{\alpha\beta\rho\sigma} = i\, {}_1\mathcal{P}_{\mu\nu\rho\sigma} ,$$ (2.7.40)

and thus, using the product rule (2.7.36), the propagator is trivial to compute

$$D_{\mu\nu\rho\sigma} = -\frac{i}{k^2 + m^2}\, {}_{\frac{1}{1-d}}\mathcal{P}_{\mu\nu\rho\sigma} .$$ (2.7.41)

23 Note that in [29,69] the authors erroneously concluded that this theory propagates only the d-tensor part of $h_{\mu\nu}$, i.e. that it has the same dynamical content as the massless theory, because it has the same tensor structure (adding a gauge-fixing term and inverting, one finds that the saturated propagator is indeed (2.5.18)). Their argument is that one has precisely integrated-out the Stückelbergs, which correspond to the d-vector and d-scalar modes, so that the latter do not appear in this equation. As we have seen, this is not true because the Stückelbergs do not represent the dynamical fields that are activated by the mass. Moreover, it is not the tensor structure of the propagator alone which determines the dynamical content, otherwise the latter would be the same in massless and massive electrodynamics. As we have also seen, the presence of the mass is important, because it will affect the conservation equation of the source in Fourier space. Indeed, as we pointed out in [68], by expressing the saturated propagator (2.5.18) in terms of the harmonic variables of the conserved sources, we get (2.5.20) with M having both a positive and a negative eigenvalue (the ghost pole). We then have that $M \to 0$ as $m \to 0$, so that we have no vDVZ discontinuity, as expected. However, for $m \neq 0$, all the independent components of the source are present and thus so are all the dynamical fields of the local theory.

We see that, because $\mathcal{E} \sim \mathcal{P}$, all the poles are at $k^2 = -m^2$, with the ghost mode having the wrong overall sign, but the same magnitude for the mass. Conversely, this is why the rest of the $\alpha \neq 0$ cases cannot be written as $(\Box - m^2)\, {}_a\mathcal{P}$ for some $a$: the mass of the ghost is not $m$ any more.
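As a worked check of the inversion behind (2.7.41), the step can be spelled out as follows. This is a sketch assuming the Fourier convention $\Box \to -k^2$ (which is what places the poles at $k^2 = -m^2$) and the composition rule ${}_a\mathcal{P}\, {}_b\mathcal{P} = {}_{ab}\mathcal{P}$ of (2.7.36); the ansatz coefficients $c$ and $x$ are introduced here for illustration only.

```latex
\begin{align*}
\mathcal{K} &= -(k^2 + m^2)\, {}_{1-d}\mathcal{P} , \qquad D = c\, {}_{x}\mathcal{P} , \\
\mathcal{K}\, D &= -c\, (k^2 + m^2)\, {}_{(1-d)x}\mathcal{P} \overset{!}{=} i\, {}_{1}\mathcal{P}
\quad \Longrightarrow \quad
x = \frac{1}{1-d} , \qquad c = -\frac{i}{k^2 + m^2} .
\end{align*}
```

The inversion requires $d \neq 1$; a single overall factor $(k^2 + m^2)^{-1}$ multiplies both the tensor and the trace sector, which is why every pole sits at the same place.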

To conclude the $\alpha \neq 0$ case (2.7.26), note that in the $m \to 0$ limit we are left with the massless theory. Thus, as expected, there is no discontinuity. Moreover, as in the spin-1 case, the non-locality is “pure-gauge” since one can fix the gauge

$$\partial^\mu \left( h_{\mu\nu} - (1 + \alpha)\, \eta_{\mu\nu} h \right) = 0 ,$$ (2.7.42)

which, remember, is possible for $\alpha \neq 0$, to get

$${}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = h_{\mu\nu} - (1 + \alpha)\, \eta_{\mu\nu} h ,$$ (2.7.43)

and thus the local equation we started with.
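The step from the gauge condition to the local mass term can be verified directly. The derivation below is a sketch that assumes the form ${}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} = \mathcal{P}_{(\mu}{}^{\rho} \mathcal{P}_{\nu)}{}^{\sigma} + \frac{1+\alpha}{\alpha} \mathcal{P}_{\mu\nu} \mathcal{P}^{\rho\sigma}$ with $\mathcal{P}_{\mu\nu} = \eta_{\mu\nu} - \Box^{-1} \partial_\mu \partial_\nu$, and that all fields have finite past so that the operators commute.

```latex
\begin{align*}
\partial^\mu h_{\mu\nu} = (1+\alpha)\, \partial_\nu h
&\;\Rightarrow\; \partial^\mu \partial^\nu h_{\mu\nu} = (1+\alpha)\, \Box h , \\
\mathcal{P}_{(\mu}{}^{\rho} \mathcal{P}_{\nu)}{}^{\sigma} h_{\rho\sigma}
&= h_{\mu\nu} - 2\, \Box^{-1} \partial_{(\mu} \partial^{\rho} h_{\nu)\rho}
 + \Box^{-2} \partial_\mu \partial_\nu\, \partial^\rho \partial^\sigma h_{\rho\sigma}
 = h_{\mu\nu} - (1+\alpha)\, \Box^{-1} \partial_\mu \partial_\nu h , \\
\frac{1+\alpha}{\alpha}\, \mathcal{P}_{\mu\nu} \mathcal{P}^{\rho\sigma} h_{\rho\sigma}
&= \frac{1+\alpha}{\alpha}\, \mathcal{P}_{\mu\nu} \left( h - \Box^{-1} \partial^\rho \partial^\sigma h_{\rho\sigma} \right)
 = -(1+\alpha) \left( \eta_{\mu\nu} - \Box^{-1} \partial_\mu \partial_\nu \right) h , \\
{}_\alpha\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma}
&= h_{\mu\nu} - (1+\alpha)\, \eta_{\mu\nu} h .
\end{align*}
```

The two non-local pieces proportional to $\Box^{-1} \partial_\mu \partial_\nu h$ cancel exactly, which is the sense in which the non-locality is pure gauge.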


Fierz-Pauli point 

We now pass to the Fierz-Pauli case. We can first observe that the value $\alpha = 0$ corresponds to a diverging $a$, so that the $\mathcal{P}$ operators are not well defined in this limit. However, one should note that now the action is linear in $\theta$ and its equation of motion (2.7.20) is

$$\partial^\mu j_\mu = \partial^\mu \partial^\nu h_{\mu\nu} - \Box h = 0 ,$$ (2.7.44)

to which we will refer as the “scalar equation”. For $h_{\mu\nu}$ with finite past this is equivalent to ${}_s\mathcal{P} \cdot h = 0$, so if the scalar equation holds then ${}_a\mathcal{P} \cdot h = {}_0\mathcal{P} \cdot h$ and we may still use the projectors. Since now $\partial^\mu j_\mu = 0$, the equation of motion of $A_\mu$ (2.7.19) has a transverse right-hand side and can be solved as before. The result is then plugged inside (2.7.18) and $\theta$ simply redefines $\phi$ again. In order to determine the latter, we can then take the trace of that equation and isolate $\phi$, to get


cj) = 



m 2 h H— T 
a 


(2.7.45) 


where we have used $\partial^\mu j_\mu = 0$ and have put to zero the homogeneous solution since it is massless. Plugging this back inside the equation we get the following system

$$\mathcal{E}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} - m^2\, {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T^{\rm tt}_{\mu\nu} ,$$ (2.7.46)

$$\partial^\mu \partial^\nu h_{\mu\nu} - \Box h = 0 ,$$ (2.7.47)

where now the source has changed and is actually the traceless-transverse part of $T_{\mu\nu}$

$$T^{\rm tt}_{\mu\nu} = T_{\mu\nu} - \frac{1}{d} \left( \eta_{\mu\nu} T - \frac{\partial_\mu \partial_\nu}{\Box} T \right) = {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} T_{\rho\sigma} ,$$ (2.7.48)

thus satisfying

$$\partial^\mu T^{\rm tt}_{\mu\nu} = 0 , \qquad T^{\rm tt} = 0 .$$ (2.7.49)

Now note that the scalar equation is just the trace of (2.7.46), so that it is not independent and can be dropped. This might appear disturbing because then we are left with the left-hand side of the theory $a = 0$, which is not the Fierz-Pauli one $\alpha = 0$, and the corresponding propagator thus has an extra ghost pole. However, when we saturate it with $T^{\rm tt}_{\mu\nu}$ we do retrieve the saturated Fierz-Pauli propagator in terms of $T_{\mu\nu}$. Thus, in this formulation the modification of the source is very relevant. The fact that the Fierz-Pauli theory has one less dynamical field is now reflected in the fact that $h_{\mu\nu}$ “sees”, and thus propagates, one less component of the source. Another advantage of this formulation is that the reason for the vDVZ discontinuity at $\alpha = 0$ is now obvious: the source remains $T^{\rm tt}$ as $m \to 0$.

Another option is to keep the scalar equation and use it to have ${}_a\mathcal{P} \cdot h = {}_0\mathcal{P} \cdot h$, and thus $\mathcal{E} \cdot h = \Box\, {}_0\mathcal{P} \cdot h$, to finally get the following system

$$(\Box - m^2)\, {}_{1-d}\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T^{\rm tt}_{\mu\nu} ,$$ (2.7.50)

$$\partial^\mu \partial^\nu h_{\mu\nu} - \Box h = 0 .$$ (2.7.51)

The left-hand side of the first equation is precisely what we have found for the $\alpha = -1/2$ case (2.7.37), but now it is the additional scalar equation which makes the whole difference. It cannot be obtained through a gauge transformation and is responsible for killing the ghost.

Again, since the theory we started with is local, non-locality can only be a pure-gauge effect, 
although this time this may be a bit less obvious to show because the source term is non-local 
as well. This is why the source must be part of the gauge-fixing condition 


d tl hr 


i 

dm 2 


Ud u T. 


(2.7.52) 


Indeed, with this the scalar equation becomes the equation fixing the trace (2.2.26) and, using 
this to express the source non-locality in terms of h, we can arrange the terms to get (2.2.24). 
Eq. (2.2.25) is then found by taking the divergence of (2.2.24) and using (2.2.26). 


Extra gauge symmetry 

Using again that all ${}_a\mathcal{P}$ act the same on $h_{\mu\nu}$, yet another interesting formulation of the Fierz-Pauli non-local equations (2.7.51) is

$$(\Box - m^2)\, {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T^{\rm tt}_{\mu\nu} ,$$ (2.7.53)

$$\partial^\mu \partial^\nu h_{\mu\nu} - \Box h = 0 .$$ (2.7.54)

The advantage here is that the first equation is invariant under linearized local conformal transformations (2.7.35), and consistently traceless on both sides. However, this is not the case for the scalar equation. We can thus “lift” Fierz-Pauli theory to a non-local gauge theory with one more gauge symmetry

$$(\Box - m^2)\, {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T^{\rm tt}_{\mu\nu} ,$$ (2.7.55)

and now interpret the scalar equation as a gauge condition that is reached using (2.7.35) with

$$\theta = -\frac{1}{d} \left( h - \Box^{-1} \partial^\mu \partial^\nu h_{\mu\nu} \right) .$$ (2.7.56)

This is a very elegant result because now the ghost mode is also neutralized by a gauge symmetry. Indeed, in the spin-1 case we were left with $d$ dynamical fields because there are $D$ fields, one gauge symmetry and no residual symmetry because of the mass. In the spin-2 case we have $D(D+1)/2$ fields and $D$ gauge symmetries in general, so that we are left with $D(D+1)/2 - D$ dynamical fields, except in the $\alpha = 0$ case where an extra gauge symmetry reduces that number by one.
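The gauge parameter that reaches the scalar equation, and the resulting count of dynamical fields, can be worked out explicitly. The lines below are a sketch assuming the conformal transformation $\delta h_{\mu\nu} = \eta_{\mu\nu} \theta$ of (2.7.35), with $\eta^{\mu\nu} \eta_{\mu\nu} = D = d + 1$ and $\theta$ of finite past.

```latex
\begin{align*}
\delta h_{\mu\nu} = \eta_{\mu\nu}\, \theta
&\;\Rightarrow\; \delta h = D\, \theta , \qquad
\delta \left( \partial^\mu \partial^\nu h_{\mu\nu} \right) = \Box \theta , \\
\delta \left( \partial^\mu \partial^\nu h_{\mu\nu} - \Box h \right)
&= (1 - D)\, \Box \theta = -d\, \Box \theta , \\
\partial^\mu \partial^\nu h_{\mu\nu} - \Box h - d\, \Box \theta = 0
&\;\Rightarrow\; \theta = -\frac{1}{d} \left( h - \Box^{-1} \partial^\mu \partial^\nu h_{\mu\nu} \right) , \\
\frac{D(D+1)}{2} - D - 1 &= \frac{(D+1)(D-2)}{2} \qquad (= 5 \ \text{for} \ D = 4) .
\end{align*}
```

The last line recovers the familiar five polarizations of a massive graviton in four dimensions.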

Now the differential operator corresponding to (2.7.55) is

$$\mathcal{K}_{\mu\nu\rho\sigma} = (\Box - m^2)\, {}_0\mathcal{P}_{\mu\nu\rho\sigma} .$$ (2.7.57)

Since it is both transverse and traceless, the appropriate identity for the inversion is

$$\mathcal{K}_{\mu\nu}{}^{\alpha\beta} D_{\alpha\beta\rho\sigma} = i\, {}_0\mathcal{P}_{\mu\nu\rho\sigma} ,$$ (2.7.58)

and thus, using the product rule (2.7.36), the propagator reads

$$D_{\mu\nu\rho\sigma} = -\frac{i}{k^2 + m^2}\, {}_0\mathcal{P}_{\mu\nu\rho\sigma} .$$ (2.7.59)

Saturating it, one finds the Fierz-Pauli result, i.e. (2.5.16) with $\alpha = 0$. This formulation provides us with yet another point of view on the vDVZ discontinuity. Indeed, in the massless theory we saw that the only projector for which $\Box \mathcal{P}$ is local is the $a = 1 - d$ one. This gives $\sim {}_{\frac{1}{1-d}}\mathcal{P}$ for the propagator and the following tensor structure for the saturated one

$$\sim \mathcal{P}_{\mu(\rho} \mathcal{P}_{\sigma)\nu} - \frac{1}{d-1}\, \mathcal{P}_{\mu\nu} \mathcal{P}_{\rho\sigma} .$$ (2.7.60)

On the other hand, Fierz-Pauli theory, because of the extra gauge symmetry that is needed to kill the ghost in the non-local formulation, must have ${}_0\mathcal{P}$ as its differential operator, and thus the tensor structure for the saturated propagator is

$$\sim \mathcal{P}_{\mu(\rho} \mathcal{P}_{\sigma)\nu} - \frac{1}{d}\, \mathcal{P}_{\mu\nu} \mathcal{P}_{\rho\sigma} .$$ (2.7.61)


2.7.3 New non-local theory 

In the case of electrodynamics, the uniqueness of the projector makes the non-local formulation of Proca theory the only stable non-local theory of a massive vector field. In the tensor case, the presence of two independent projectors, ${}_0\mathcal{P}$ and ${}_s\mathcal{P}$ defined in (2.7.33), allows us to construct more healthy models than the ones that are obtained from local theories. In particular, as we will see in this thesis, one can construct a novel, genuinely non-local linear theory that includes the trace scalar but has no ghost poles in the propagator. This is possible if we also modify the kinetic term non-locally, so it will not correspond to simply adding a non-local mass term to linearized GR.

To construct that theory, we take full advantage of the projector formalism developed above to write an equation in which the tensor and scalar modes are diagonalized

$$(\Box - m_g^2)\, {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} + (z \Box - m_s^2)\, {}_s\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T_{\mu\nu} ,$$ (2.7.62)

so that each one of them can have its own mass. The $z$ factor will be useful in tracking ghost-like behaviour. Now since by definition ${}_0\mathcal{P} + z\, {}_s\mathcal{P} = {}_z\mathcal{P}$, the only case in which the kinetic part is local, and thus coincides with linearized GR, is

$$z = 1 - d .$$ (2.7.63)

To study the stability and particle content of these theories let us compute the corresponding propagator. Because of the scalar sector we have that the differential operator

$$\mathcal{K}_{\mu\nu\rho\sigma} = (\Box - m_g^2)\, {}_0\mathcal{P}_{\mu\nu\rho\sigma} + (z \Box - m_s^2)\, {}_s\mathcal{P}_{\mu\nu\rho\sigma} ,$$ (2.7.64)

is transverse but not traceless, so that the appropriate identity element for the inversion is

$$\mathcal{K}_{\mu\nu}{}^{\alpha\beta} D_{\alpha\beta\rho\sigma} = i\, {}_1\mathcal{P}_{\mu\nu\rho\sigma} ,$$ (2.7.65)

and the solution is (using the product rule (2.7.36))

$$D_{\mu\nu\rho\sigma} = -\frac{i}{k^2 + m_g^2}\, {}_0\mathcal{P}_{\mu\nu\rho\sigma} - \frac{i}{z k^2 + m_s^2}\, {}_s\mathcal{P}_{\mu\nu\rho\sigma} .$$ (2.7.66)

Saturating it with conserved sources we get

$$\tilde{T}^{*\mu\nu} D_{\mu\nu\rho\sigma} \tilde{T}^{\rho\sigma} = -\frac{i}{k^2 + m_g^2} \left[ \tilde{T}^*_{\mu\nu} \tilde{T}^{\mu\nu} - \frac{1}{d}\, |\tilde{T}|^2 \right] - \frac{1}{d}\, \frac{i}{z k^2 + m_s^2}\, |\tilde{T}|^2 ,$$ (2.7.67)


which is the Fierz-Pauli propagator with mass $m_g$ plus a healthy scalar propagator, for $z > 0$, with mass $m_s / \sqrt{|z|}$. Thus, the first term in (2.7.62) describes the massive SO(d)-tensor modes, while the second term describes the massive trace mode. This is a remarkable advantage compared to local massive spin-2 theory, where that extra scalar can only be a ghost. In our formalism, instead of having to fight to kill that extra mode allowed by the diffeomorphism symmetry, we have the opportunity to simply let it participate in the dynamics since we can choose $z$ freely. Moreover, its mass is also free, instead of being determined by the one of the tensor modes. Note also that for $m_g \neq 0$ this is not a scalar-tensor theory, nor a bigravity theory in disguise, where the scalar or the second metric would have been integrated-out. Indeed, in scalar-tensor theories the graviton is not massive, while in bigravity theories there is also a massless graviton.
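The role of $z$ can be made explicit by bringing the scalar pole to standard form. This is a sketch assuming that the scalar part of the saturated propagator (2.7.67) is $-\frac{1}{d} \frac{i}{z k^2 + m_s^2} |\tilde{T}|^2$, with $\tilde{T}$ the trace of the Fourier-space source.

```latex
\begin{align*}
-\frac{1}{d}\, \frac{i}{z k^2 + m_s^2}\, |\tilde{T}|^2
 &= -\frac{1}{z\, d}\, \frac{i}{k^2 + m_s^2 / z}\, |\tilde{T}|^2 .
\end{align*}
```

The pole therefore sits at $k^2 = -m_s^2 / z$, and the overall sign relative to the tensor term $-i / (k^2 + m_g^2)$ is the sign of $z$: for $z > 0$ the scalar has the same sign as the healthy tensor modes and mass $m_s / \sqrt{z}$, while for $z < 0$ the sign is flipped.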

We thus have that stability requires $z > 0$, as could have been expected from (2.7.62). This means however that, if we want the kinetic term to be the one of GR (2.7.63), then the scalar is a ghost. The exception is when both masses are zero, in which case that mode is neutralized by the residual gauge symmetry of linearized GR. Thus, as in Fierz-Pauli theory, continuity with GR in the massless limit can only be achieved in the presence of a ghost. Conversely, any ghost-free massive theory will have a discontinuity, at the linearized level at least.

This can be easily seen by considering the massless limit $m_g \to 0$ in the saturated propagator. So let us rewrite the latter as

$$\tilde{T}^{*\mu\nu} D_{\mu\nu\rho\sigma} \tilde{T}^{\rho\sigma} = -\frac{i}{k^2 + m_g^2} \left[ \tilde{T}^*_{\mu\nu} \tilde{T}^{\mu\nu} - \frac{1}{d-1}\, |\tilde{T}|^2 \right] - \frac{1}{d(d-1)}\, \frac{i}{k^2 + m_g^2}\, |\tilde{T}|^2 - \frac{1}{d}\, \frac{i}{z k^2 + m_s^2}\, |\tilde{T}|^2 ,$$ (2.7.68)

so that the first term reduces to the GR result in the $m_g \to 0$ limit. We see that we are left with the usual vDVZ discontinuity of the Fierz-Pauli propagator, representing the gauge-invariant combination of the two d-scalars in $h_{\mu\nu}$, plus the massive scalar mode. Taking also $m_s \to 0$, we see that only in the case (2.7.63) does one obtain linearized GR, but then the massive theory has a ghost.

There is however an important difference with FP theory regarding that discontinuity. Here the discontinuity is already visible at the level of the equations of motion (2.7.62), since we do not retrieve the massless local equations in the $m_g, m_s \to 0$ limit, for $z \neq 1 - d$. On the other hand, in FP theory the action tends to the massless one in the $m \to 0$ limit. The reason for this difference is the presence of projectors, and thus gauge-invariance. Indeed, thanks to the projectors the tensor structure $\mathcal{K}_{\mu\nu\rho\sigma}$ in the equations of motion (2.7.62) is identical (see footnote 24) to the structure of the propagator (2.7.67). Because of this, any discontinuity in the latter must also arise in the former. In FP theory on the other hand, the tensor structure $\mathcal{K}_{\mu\nu\rho\sigma}$ in the action and the one in the propagator $D_{\mu\nu\rho\sigma}$ are not at all the same and one can thus have a discontinuity in the latter that does not show up in the former.


Genuine non-locality 


Let us now try to turn (2.7.62) into a system of local equations by fixing the gauge. The choice which makes the ${}_a\mathcal{P}$ operator local and involves only local operators is

$$\partial^\mu \left( h_{\mu\nu} - \frac{1 - a}{D - a}\, \eta_{\mu\nu} h \right) = 0 ,$$ (2.7.69)

which is accessible since $(1 - a)/(D - a) \neq 1$. For generic masses $m_g$ and $m_s$ this gauge does not make the equation local, whatever the choice of $a$, so the system is genuinely non-local. The only exception is when $m_s^2 = z\, m_g^2 \equiv z\, m^2$ because then (2.7.62) can be expressed in terms of a single $\mathcal{P}$ operator

$$(\Box - m^2)\, {}_z\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma} = -T_{\mu\nu} ,$$ (2.7.70)

and we can fix the (2.7.69) gauge with $a = z$ to get the local system

$$(\Box - m^2) \left( h_{\mu\nu} - \frac{1 - z}{D - z}\, \eta_{\mu\nu} h \right) = -T_{\mu\nu} ,$$ (2.7.71)

$$\partial^\mu \left( h_{\mu\nu} - \frac{1 - z}{D - z}\, \eta_{\mu\nu} h \right) = 0 .$$ (2.7.72)

This is reminiscent of the situation in local massive spin-2 equations, because (2.7.72) looks like the divergence of (2.7.71). Upon close inspection however, we observe that the analogy does not hold because here the divergence of (2.7.71) implies only that $\partial^\mu \left( h_{\mu\nu} - \frac{1 - z}{D - z}\, \eta_{\mu\nu} h \right)$ is a free dynamical field, not zero. Because of this, these equations do not derive from the local action


S 


d D x 


h gi/ (□ - m 2 ) 



1 - ^ 


D-z 



+ h pv T pv 


(2.7.73) 


which describes an obviously unstable theory since it does not have the GR tuning in the kinetic sector. Therefore, even in the case of local gauge-fixed equations, the theory does not derive from a local action and we thus have genuine non-locality.

In the case of local theories, the fact that one could localize the equations by fixing the gauge was a consequence of the fact that the integrated-out fields were pure-gauge. It therefore seems that, if we now wish to localize the above equations by integrating-in some auxiliary fields, the latter will not be pure-gauge, so that these theories cannot be obtained from some Stückelberg-ed local theory. This is not a surprise, since we know Proca and Fierz-Pauli theories to be the only ghost-free local theories of spin-1 and spin-2 dynamics, respectively.

24 Up to Klein-Gordon operators.
Chapter 3

Subtleties of non-local field theory 


Now that we have reached the subject of non-local field theory, it is important that we discuss some peculiar features that distinguish it from local field theory. This chapter is based on, and extends, [68,71,72].

3.1 Non-local actions 

3.1.1 Schwinger-Keldysh formalism 

The first point is that causal non-local equations of motion cannot derive from the strict application of the variational principle on some non-local action. Indeed, say we wish to vary an action containing a term of the form

$$\int d^D x\, \phi\, \Box_r^{-1} \psi = \int d^D x\, d^D y\, \phi(x)\, G_r(x, y)\, \psi(y) ,$$ (3.1.1)

where “r” denotes the retarded Green’s function. The variation with respect to $\phi$ will provide a causal equation of motion

$$\int d^D y\, G_r(x, y)\, \psi(y) = \left( \Box_r^{-1} \psi \right)(x) ,$$ (3.1.2)

but the variation with respect to $\psi$ will involve the “transposed” Green’s function $G_r^T(x, y) \equiv G_r(y, x) = G_a(x, y)$, which is thus the advanced one

$$\int d^D y\, G_r(y, x)\, \phi(y) = \left( \Box_a^{-1} \phi \right)(x) ,$$ (3.1.3)

so that this equation is anti-causal. In the case $\phi = \psi$, such as in the kinetic terms that would correspond to the non-local theories we constructed, one would rather get the term

$$\int d^D y \left( G_r(x, y) + G_r(y, x) \right) \phi(y) = \left( \Box_r^{-1} \phi + \Box_a^{-1} \phi \right)(x) ,$$ (3.1.4)

i.e. the retarded function is effectively symmetrized inside the action. This is a direct consequence of the time-reversal and time-translational symmetries, i.e. the physics that derives from an action is reversible and invariant under time-translations. Conversely, if the equations of motion are non-local but causal, then there is an arrow of time and they therefore cannot derive from an action. This is why causal non-local equations encompass, for example, dissipative/non-conservative systems [66,67] and systems with memory. Yet another way to understand this is by noting that, although one uses initial conditions to evolve the equations, the variation of the action is performed by fixing boundary conditions in time. This is clearly non-local data and thus the result will in general depend on the whole time-interval, with the only exception being for local actions [66].
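The symmetrization can be seen concretely in a discretized toy model. The sketch below is an illustration under stated assumptions, not a construction from the text: a single time dimension, the retarded Green's function of $d^2/dt^2$ sampled on a uniform grid, and a quadratic action $S = \sum_{ij} \phi_i G_{ij} \phi_j\, dt^2$; all function names are hypothetical.

```python
def retarded_kernel(n, dt):
    """Discretized retarded Green's function of d^2/dt^2: G[i][j] = t_i - t_j for t_i > t_j, else 0."""
    return [[(i - j) * dt if i > j else 0.0 for j in range(n)] for i in range(n)]


def action(phi, G, dt):
    """S[phi] = sum_ij phi_i G_ij phi_j dt^2, a discretized analogue of (3.1.1) with psi = phi."""
    n = len(phi)
    return sum(phi[i] * G[i][j] * phi[j] for i in range(n) for j in range(n)) * dt * dt


def numerical_variation(phi, G, dt, i, eps=1e-3):
    """Central finite difference for dS/dphi_i, divided by dt to mimic a functional derivative."""
    up = list(phi)
    up[i] += eps
    down = list(phi)
    down[i] -= eps
    return (action(up, G, dt) - action(down, G, dt)) / (2 * eps * dt)


n, dt = 6, 0.5
G = retarded_kernel(n, dt)
phi = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0]

# Analytic expectation, cf. (3.1.4): the kernel entering the equation of motion is G + G^T.
symmetrized = [[G[i][j] + G[j][i] for j in range(n)] for i in range(n)]
analytic = [sum(symmetrized[i][j] * phi[j] for j in range(n)) * dt for i in range(n)]
numeric = [numerical_variation(phi, G, dt, i) for i in range(n)]
max_error = max(abs(x - y) for x, y in zip(analytic, numeric))

# The retarded kernel has no support on the future (j > i); the symmetrized one does.
future_support_retarded = any(G[0][j] != 0.0 for j in range(1, n))
future_support_symmetrized = any(symmetrized[0][j] != 0.0 for j in range(1, n))
print(max_error, future_support_retarded, future_support_symmetrized)
```

The variation at the earliest grid point depends on all later field values, which is the discrete version of the statement that the equation derived from the action is not causal.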

Therefore, non-local equations of motion appear to be of less fundamental significance since they cannot derive from an action and thus cannot be understood as the saddle point approximation of some path integral. Nevertheless, one should remember that this is actually not the rigorous connection between quantum mechanics and classical equations. Rather, the equation of motion of a classical field $\phi$ has physical relevance because it can be understood as the $\hbar \to 0$ limit of the equation of motion of some expectation value $\langle \hat{\phi} \rangle(t) \equiv \langle \Psi | \hat{\phi}(t) | \Psi \rangle$ of the corresponding operator $\hat{\phi}$, for some fixed state $\Psi$. The evolution of $\langle \hat{\phi} \rangle(t)$ is governed by the quantum effective action $\Gamma$ and, as it turns out, in interacting theories $\Gamma$ is indeed non-local because of the non-local nature of quantum corrections [60-64] (see footnote 1). So non-locality is not such an exotic feature when one is interested in realistic equations of motion deriving from some underlying QFT and, as a matter of fact, non-local terms $\sim \Box^{-1}$ even dominate in the infra-red. So how can these equations be causal?

The important point is to realize that $\Gamma$ is not an action in the usual sense of an integral over all of space-time and thus it is a somewhat modified variational principle that allows us to extract physically sensible equations of motion. Indeed, the effective action $\Gamma$ we are discussing here, which we will denote by “$\Gamma_{\rm in-in}$”, should not be confused with the better known quantum effective action $\Gamma_{\rm in-out}$ that is used in the computation of scattering amplitudes and is an action of the usual form $\int_{t_i}^{t_f} dt\, L(t)$. In order to clearly distinguish the two, let us first describe $\Gamma_{\rm in-out}$. In that case one is interested in S-matrix elements $\langle \Psi_{\rm out} | \Psi_{\rm in} \rangle$ where the ket is a state at the initial time $t_i$ and the bra is a state at the final time $t_f$. Therefore, the path integral representation of this quantity involves the integral of the Lagrangian

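$$\langle \Psi_{\rm out} | \Psi_{\rm in} \rangle \sim \int \left( \prod_t d\phi(t) \right) \Psi^*_{\rm out}[\phi(t_f)]\, \Psi_{\rm in}[\phi(t_i)]\, e^{i \int_{t_i}^{t_f} dt\, L[\phi(t)]} ,$$ (3.1.6)


over the whole time interval $[t_i, t_f]$. The quantum effective action $\Gamma_{\rm in-out}[\varphi]$, where $\varphi(t) = \langle \Psi_{\rm out} | \hat{\phi}(t) | \Psi_{\rm in} \rangle$, is then the Legendre transform of the generating functional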


ILm—out [<7] 







(3.1.7) 


1 More precisely, in perturbative QFT the propagator $\sim (k^2 + m^2)^{-1}$ corresponds to a non-local operator $(\Box - m^2)^{-1}$ in real space, so the loop corrections will in general be non-local. For scales $k^2 \ll m^2$ however one can expand

$$\frac{1}{k^2 + m^2} = \frac{1}{m^2} \left( 1 - \frac{k^2}{m^2} + \mathcal{O}(k^4) \right) ,$$ (3.1.5)

in which case the corresponding real-space corrections are a series of local, but higher-derivative operators. In the presence of massless particles however, such as in the case of gravity for example, the propagator becomes non-analytic in $k^2$ around $k^2 = 0$, so these corrections are non-local at all scales.

where $J$ is an external linear source. Although the equations of motion of $\Gamma_{\rm in-out}$ provide the time-evolution of $\varphi(t)$ for $J = 0$, by construction, $\Gamma_{\rm in-out}$ is mostly used for its property of being the generating functional of 1PI diagrams. Indeed, the equations of motion of $\varphi(t)$ are not very relevant because they are acausal, since the sum over paths will depend on both what happens before and after $t$. Moreover, if one works with vacuum-to-vacuum amplitudes on backgrounds with non-trivial evolution, as is the case in cosmology for instance, then the initial vacuum is not proportional to the final vacuum (see footnote 2) and $\langle 0_{\rm out} | \hat{\phi} | 0_{\rm in} \rangle$ is not even real (see footnote 3). Thus, this $\varphi$ usually lacks physical interpretation, since it is not an eigenvalue of the operator $\hat{\phi}$ and is intrinsically non-local in its definition.

In order to get causal equations of motion for some real field one rather needs to consider the quantum effective action for an expectation value $\langle \hat{\phi} \rangle(t) \equiv \langle \Psi_{\rm in} | \hat{\phi}(t) | \Psi_{\rm in} \rangle$, i.e. with both the ket and the bra being the same state defined at $t_i$ (see footnote 4). Now however the path integral is constructed in a different way and we enter the so-called “in-in” or “Schwinger-Keldysh” or “closed time-path” formalism [60,61,89-94]. In the scattering case, we had that

$$\langle \Psi_{\rm out} | \hat{\phi}(t) | \Psi_{\rm in} \rangle \sim \int \left( \prod_{t'} d\phi(t') \right) \Psi^*_{\rm out}[\phi(t_f)]\, \phi(t)\, \Psi_{\rm in}[\phi(t_i)]\, e^{i \int_{t_i}^{t_f} dt'\, L[\phi(t')]} ,$$ (3.1.9)


because one must connect $|\Psi_{\rm in}\rangle$ from $t_i$ to $\phi$ at $t$ and then the latter to $\langle \Psi_{\rm out}|$ at $t_f$. In the case of $\langle \Psi_{\rm in} | \hat{\phi}(t) | \Psi_{\rm in} \rangle$ we connect $|\Psi_{\rm in}\rangle$ from $t_i$ to $\phi$ at $t$, but then we have to connect the latter back to $\langle \Psi_{\rm in}|$ at $t_i$, i.e. by going backwards in time. This gives


(T in |(Xt)|Ti : 


n #+(*) I [ n Tin [<j>+ (**)] (3.1.10) 


u'e[U,t] 




x5 (4>+(t) - 4>-(t)) exp 


i ( d t'L[cj) + (t , )\ + i f d t'L[(j)-{t')] 

Jti Jt 


It is now obvious that the dynamics of $\langle \hat{\phi} \rangle(t)$ can only depend on the physics in the time-interval $[t_i, t]$, so that its evolution must be causal. The corresponding quantum effective action $\Gamma_{\rm in-in}$ will then be the Legendre transform of the generating functional


Win—in [J+, J-] = ~ 


iiog / ( n d< ^+( t )) ( n (3.1.H) 

J / \ I'efc *1 / 




x5 (</>+(f) - cj>-{t))ex p 


i / dt' (L[<f> + (t!)] - L[cj)_{t')\ - <j) + J + + 


2 Or the latter is not even known.

3 This is why $\Gamma_{\rm in-out}$ can be used for computing the lowest order quantum corrections to a potential $V(\varphi)$ on flat space-time, because then $|0_{\rm out}\rangle \sim |0_{\rm in}\rangle$ and one can restrict to the cases $\varphi = {\rm const}$ where the time-non-locality is irrelevant [95].

4 As explained in [96], even in the case of scattering amplitudes what is physically observable is not the amplitude, but the corresponding probability

$$| \langle \Psi_{\rm out} | \Psi_{\rm in} \rangle |^2 = \langle \Psi_{\rm in} | \left( | \Psi_{\rm out} \rangle \langle \Psi_{\rm out} | \right) | \Psi_{\rm in} \rangle ,$$ (3.1.8)

which also takes the form of an expectation value of some operator.

and will thus depend on two fields, $\Gamma_{\rm in-in}[\varphi_+, \varphi_-]$: the one representing $\varphi$ on $[t_i, t]$, going forward in time, $\varphi_+$, and the one representing $\varphi$ on $[t, t_i]$, going backwards, $\varphi_-$. Concretely,


$$\Gamma_{\rm in-in}[\varphi_+, \varphi_-; t] = \int_{t_i}^{t} dt' \left( L[\varphi_+(t')] - L[\varphi_-(t')] \right) + \mathcal{O}(\hbar) ,$$ (3.1.12)


where $L$ is the fundamental Lagrangian and the quantum corrections will typically mix the two sectors precisely because of non-locality. For instance, we may find terms of the form (see footnote 5)

$$\int dt'\, dt''\, \varphi_+(t')\, G_r(t', t'')\, \varphi_-(t'') ,$$ (3.1.13)


where $G_r$ is the retarded Green’s function. Note that $\varphi_+(t')$ is indeed causally propagated forward in time to $\varphi_-(t'')$, since the latter occurs in front of it in this bent time-line. As in the scattering case, the variational principle is now a direct consequence of the relation between $\Gamma$ and $W$. By construction


5ip+(t') 


-■ 4 ( 0 , 


5ip-{t') 


■MO 


(3.1.14) 


so for vanishing external source we get that the variation of $\Gamma_{\rm in-in}$ is zero. The additional requirement here is that one must evaluate these equations at $t$, where the two functions coincide by definition, $\varphi_+(t) = \varphi_-(t) = \varphi(t)$. Since $\varphi_+$ is “going forward in time” it will obey a causal equation, while since $\varphi_-$ “goes backward in time” it will obey an anti-causal equation. It is thus the equation for $\varphi_+$ which is relevant for us, while the one of $\varphi_-$ is its time-reversed copy. Applying this variational principle to the example given above (3.1.13), we get that the corresponding term in the equation of motion is indeed the causal one, $\Box_r^{-1} \varphi$.
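As a worked illustration of this modified variational principle, consider the mixing term of (3.1.13) on its own. The lines below are a sketch that assumes the term has the form $\int dt'\, dt''\, \varphi_+(t')\, G_r(t', t'')\, \varphi_-(t'')$, with $G_r(t, t'')$ supported on $t'' < t$; overall signs and factors from the rest of $\Gamma_{\rm in-in}$ are not tracked.

```latex
\begin{align*}
\Gamma_{\rm in-in} \supset \int dt'\, dt''\, \varphi_+(t')\, G_r(t', t'')\, \varphi_-(t'')
&\;\Rightarrow\;
\frac{\delta \Gamma_{\rm in-in}}{\delta \varphi_+(t)} \supset \int dt''\, G_r(t, t'')\, \varphi_-(t'') , \qquad
\frac{\delta \Gamma_{\rm in-in}}{\delta \varphi_-(t)} \supset \int dt'\, \varphi_+(t')\, G_r(t', t) , \\
\varphi_+ = \varphi_- = \varphi
&\;\Rightarrow\;
\frac{\delta \Gamma_{\rm in-in}}{\delta \varphi_+(t)} \supset \int dt''\, G_r(t, t'')\, \varphi(t'') = \left( \Box_r^{-1} \varphi \right)(t) .
\end{align*}
```

The $\varphi_+$ equation only involves field values at earlier times, whereas the $\varphi_-$ equation involves $G_r(t', t)$ with $t' > t$ and is therefore its anti-causal copy.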

One should also note that the boundary conditions of this variational principle are given at the extremities of the time-line, which here correspond simply to $t_i$, but for two fields $\varphi_\pm$. Thus, for the field $\varphi$ at the end of the application of the variational principle, these are nothing but the initial conditions. Therefore, this is a variational principle that relies on fixing initial data instead of boundary data. Going back to section 2.1.1, remember that the Feynman propagator is the $\Box^{-1}$ corresponding to the boundary conditions of the “in-out” path integral with $|\Psi_{\rm in}\rangle = |0_{\rm in}\rangle$ and $|\Psi_{\rm out}\rangle = |0_{\rm out}\rangle$. It is symmetric, $(\Box_F^{-1})^T = \Box_F^{-1}$, and thus privileges no time direction, consistent with the fact that the boundary conditions of the path integral are defined at both past and future infinity. Here we see that the retarded propagator is the $\Box^{-1}$ of the “in-in” path integral for $|\Psi_{\rm in}\rangle = |0_{\rm in}\rangle$, where one fixes initial conditions instead of boundary conditions and where the arrow of time is explicit. Indeed, for a scalar field in (3.1.11) one must insert an $i\epsilon \phi_+^2$ factor in $L[\phi_+]$ and a $-i\epsilon \phi_-^2$ factor in $L[\phi_-]$ for the path integral to converge. For the classical solutions $\varphi$, which dominate the path integral, this imposes no ingoing positive frequency modes at past infinity, through $\phi_+$, and no negative frequency modes at past infinity again, through $\phi_-$, so these effectively become the boundary conditions of the retarded Green’s function (2.1.5).

Finally, note that the above construction holds only for theories for which the fundamental Lagrangian is local, with the non-localities in $\Gamma$ being due to quantum corrections. This is because in constructing the path integral one must first pass through the canonical formalism, and the latter does not exist in the non-local case precisely because of time non-locality. Nevertheless, the “in-in” action and the corresponding variational principle can be taken, independently of their quantum origin, as a well-defined action-based formulation for classical non-local field theory. As a matter of fact, such a construction has also been used from the purely classical point of view in order to enlarge the scope of action-based mechanics to include dissipative systems as well [66,67]. In particular, this has allowed for a generalization of Noether’s theorem that provides the variation of the charges in terms of the dissipative part of the action [67].

5 In general one finds arbitrary powers of different Green’s functions, but always such that the corresponding integration kernel is zero when its second argument is outside the past light-cone of its first argument.


3.1.2 Formal action 


An interesting observation about the issue that was raised in the previous section is that the whole problem revolves around the type of Green’s function that will appear in the equations of motion. Apart from that, the equations one would derive using the standard variational principle on some $S_{\rm in-out}$, or with the modified variational principle applied on some $S_{\rm in-in}$, would be formally the same. Since the usual $S_{\rm in-out}$ action is simpler and closer to our habits, it would be very convenient if we could use it anyway, even if we have to rely on purely formal manipulations. Indeed, we could for instance decide that all $\Box^{-1}$ occurrences inside the action are formal, i.e. undetermined linear inverses of $\Box$. Then, once the equations of motion have been computed, one should turn all the $\Box^{-1}$ into retarded ones by hand. This is in fact a standard way of proceeding (see [64,65,96,97] and references therein).

Since the difference of the convolution with two different $\Box^{-1}$ is a homogeneous solution, we can give a meaning to this formal action as a functional on the quotient space of fields modulo homogeneous solutions of $\Box$. In this space the kernel of $\Box$ is trivial, by construction, and thus the equivalence class $[\Box^{-1}]$ is unique. In the case of the equations of motion however, where homogeneous solutions matter, one has to choose the representative of $[\Box^{-1}]$ that gives sensible physics, i.e. $\Box_r^{-1}$.
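The statement that two inverses of $\Box$ differ by a homogeneous solution can be illustrated in one time dimension. The sketch below assumes the operator $d^2/dt^2$ on a uniform grid with its retarded and advanced Green's functions; the grid, the source values and the helper names are chosen for illustration only.

```python
def convolve(kernel, source, dt):
    """Discrete convolution u_i = sum_j kernel(i, j) * source_j * dt."""
    n = len(source)
    return [sum(kernel(i, j) * source[j] for j in range(n)) * dt for i in range(n)]


def second_difference(u, dt):
    """Discrete d^2/dt^2 at the interior grid points."""
    return [(u[i + 1] - 2 * u[i] + u[i - 1]) / dt**2 for i in range(1, len(u) - 1)]


dt = 0.25
source = [0.0, 1.0, -2.0, 0.5, 3.0, 0.0, -1.0, 2.0]


def retarded(i, j):
    """Retarded Green's function of d^2/dt^2: supported on t_i > t_j."""
    return (i - j) * dt if i > j else 0.0


def advanced(i, j):
    """Advanced Green's function of d^2/dt^2: supported on t_i < t_j."""
    return (j - i) * dt if j > i else 0.0


u_ret = convolve(retarded, source, dt)
u_adv = convolve(advanced, source, dt)
difference = [r - a for r, a in zip(u_ret, u_adv)]

# Both convolutions invert the operator, so their difference is annihilated by it.
residual_ret = [x - s for x, s in zip(second_difference(u_ret, dt), source[1:-1])]
residual_adv = [x - s for x, s in zip(second_difference(u_adv, dt), source[1:-1])]
homogeneous = second_difference(difference, dt)
print(max(abs(x) for x in homogeneous), max(abs(x) for x in difference))
```

Here the difference is a linear function of time, i.e. a non-zero solution of the homogeneous equation, so the two inverses agree only on the quotient space described above.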

Now note that treating all the $\Box^{-1}$ as equivalent during the variation implies some important simplifications. For instance, this means that we can effectively integrate $\Box^{-1}$ by parts. Indeed

$$\int d^D x\, \phi(x) \left( \Box^{-1} \psi \right)(x) = \int d^D x\, d^D y\, \phi(x)\, G(x, y)\, \psi(y)$$
$$= \int d^D x\, d^D y\, \psi(y)\, G^T(y, x)\, \phi(x)$$
$$= \int d^D y\, \psi(y) \left( (\Box^{-1})^T \phi \right)(y)$$
$$= \int d^D y\, \psi(y) \left( \Box^{-1} \phi \right)(y) ,$$ (3.1.15)


since the transposed $(\Box^{-1})^T$ is also a right-inverse, $\Box\, (\Box^{-1})^T = {\rm id}$ (see appendix A.3.1 for the case $\Box_r^{-1}$). A related simplification is the fact that now $\Box^{-1}$ is also a left-inverse, $\Box^{-1} \Box = {\rm id}$, since, from appendix A.3.2, we know that $\Box^{-1} \Box$ is the identity up to a homogeneous solution. As an example, the formal action corresponding to the non-local equation (2.7.62) reads

S = IJ d ° x [V (( D - m l) ^ wpa + { zU - sV^ pa ) V + ^r] , (3.1.16) 

where the $\Box^{-1}$ inside the projectors are formal. Finally, note that integrating-out fields to get non-local formulations can now be performed at the level of this formal action.

3.1.3 Non-local path integral


In section 3.1.1, the obstruction to the existence of a “in-out” action for some causal non-local 
equations was traced back to the fact that G v is not symmetric under time-reversal. However, 
this is not the case of its Feynman cousin Gp and it is the latter that appears in the path 
integral for scattering amplitudes, i.e. the “in-out” case with |\Hj n ) = |0; n ) and (’Foutl = (0 ou t|. 
Thus, there is no need for formal manipulations in writing down such a path integral for our 
non-local theories. 

For instance, we can now literally integrate-out the Stiickelbergs of the local theories, i.e. 
by integrating over them in the path integral 6 . More precisely, we can start with the path 
integral of the original local theory, perform the Stiickelberg trick, insert a gauge-fixing term 
for the gauge field, and then integrate-out the Stiickelberg field to get a non-local theory. For 
example, for Proca theory, this procedure gives 


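$$
Z[J] \sim \int \mathcal{D}A\, \exp\, i \int d^D x \left[ \frac{1}{2} A_\mu \left( \Box - m^2 + i\epsilon \right) \mathcal{P}^{\mu\nu} A_\nu - \frac{1}{2\xi} \left( \partial_\mu A^\mu \right)^2 + J^\mu A_\mu \right] , \tag{3.1.17}
$$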


where, as we know from section 2.1.1, it is the Feynman inversion of $\Box$ which arises in the
transverse projector $\mathcal{P}$. Contrary to the case of classical physics, where the retarded prescription
is lost inside the path integral because of symmetrization, here there is no inconsistency since
the Feynman propagator is symmetric. The equations of motion of this action are acausal, but
the scattering amplitudes are the ones of Proca theory, by construction. This is simply a local
QFT with a field that has been integrated-out. Indeed, the two-point function can be computed
by further integrating-out $A_\mu$ and taking the double functional derivative with respect to the
source. One gets


$$
\langle 0 | T A_\mu(-k) A_\nu(k) | 0 \rangle = \frac{-i}{k^2 + m^2 - i\epsilon}\, \mathcal{P}_{\mu\nu} + (\dots)\, k_\mu k_\nu , \tag{3.1.18}
$$

whose physical part is thus the same as the propagator (2.7.17) with the Feynman $\epsilon$ prescription.
Moreover, note that the presence of $\Box_F^{-1}$ does not constrain the fields more than in the local
case, since the boundary conditions of the path integral are the ones for which $\Box_F^{-1}$ is defined
anyway.

In local QFT one usually integrates out a dynamical field when one is not interested in 
the scattering amplitudes containing the associated particles in the “in” and “out” states. The 
important question now is whether one can proceed in the same way for the genuinely non-local 
theories, i.e. without having a corresponding local action for them. Indeed, in the case of the 
non-local formulation of Proca theory, we were sure that the non-local path integral was not 
pathological because it simply amounted to the one of a local theory with some integrated- 
out field. To make sense of a path integral corresponding to the non-local spin-2 theories 
introduced in the previous section we should first find some local formulation, and study its 
own quantization. 


6 Of course, for quadratic fields, this has precisely the effect of replacing the fields by the solution to their
equation of motion, although with the Feynman prescription if some $\Box - m^2$ has been inverted in the process.

3.2 Localization

In the case of local equations, as discussed in section 2.1.2 and as shown in the case of linear
massive gauge theories, we have that each dynamical field brings in two degrees of freedom
corresponding to its initial value and the one of its time-derivative. In non-local field theory
this rule does not hold anymore, and properly understanding the consequences of this fact is 
very important if we wish to settle stability issues. Of course, the notion of dynamical field may 
seem a bit ambiguous when non-localities are around, so we must first express the theory in a 
way where this terminology is well-defined. Our argumentation will be much more transparent 
if we parallel it with a simple example highlighting the important features. Consider the 
following non-local equation for some field $\phi$ with source $J$

$$ \Box \phi - m^4 \Box^{-1} \phi = J . \tag{3.2.1} $$

This of course makes sense only if $\phi$ has finite past, but we can also decide that time starts at
some finite $t_i$, in which case the initial conditions of $\phi$ could be chosen freely 7. In any case, for
our purposes it will not matter whether the initial conditions of $\phi$ are constrained for consistency
or not. Equation (3.2.1), although quite clear to understand, is an integro-differential equation
and thus not very transparent as far as the dynamical content is concerned. It is therefore very
convenient to introduce an auxiliary field $\psi$ which we define by

$$ \psi \equiv m^2 \Box^{-1} \phi , \tag{3.2.2} $$

to get that the equation now takes a local form

$$ \Box \phi = m^2 \psi + J . \tag{3.2.3} $$

One must then supplement it with the equation satisfied by $\psi$ which, by construction, is a
dynamical equation

$$ \Box \psi = m^2 \phi . \tag{3.2.4} $$

Observe that this appears as the inverse of the operation of “integrating-out”, so we may say
that we have “integrated-in” $\psi$. However, if we now reverse-engineer and integrate-out $\psi$, then
the most general solution of (3.2.4) reads

$$ \psi = \psi^{\rm hom} + m^2 \Box^{-1} \phi , \tag{3.2.5} $$

where $\psi^{\rm hom}$ is a homogeneous solution, $\Box \psi^{\rm hom} = 0$. Note that this is (3.2.2) only in the case $\psi^{\rm hom} = 0$
and, in particular, we must have $\psi \to 0$ if $m \to 0$. Since the set of homogeneous solutions
is isomorphic to the set of initial conditions, the definition of $\psi$ (3.2.2) constrains its initial
conditions to be zero at $t \to -\infty$ if $\phi$ has finite past, or at $t_i$ if this is when we start the
convolution in (3.2.1).

In any case, we have that $\psi$ is a dynamical field, i.e. it obeys a second-order equation in
time, but does not represent degrees of freedom of the theory, i.e. its initial conditions are
not free to choose (see section 2.1.2 for a reminder on these definitions). Such fields are thus
commonly referred to as “spurious degrees of freedom” in the literature. However, as we
will see later, their effect on the physics will be far from being “spurious”, so we will avoid
this terminology. We will rather refer to such fields as “constrained dynamical fields”. For
the moment, note that the local equations (3.2.3) and (3.2.4), subject to the constraints on the
initial data of $\psi$, have exactly the same solutions as (3.2.1), by construction. They thus provide
a more transparent point of view on the physics, since we are certainly more used to working 
with local equations. 

7 What one should not do in this case however, is consider the times $t < t_i$ because for them the Green’s
function will be advanced.

Understanding $N_f \neq 2N_d$

We thus have that the number of dynamical fields in (3.2.1), both constrained and unconstrained,
is $N_d = 2$, while the number of degrees of freedom is $N_f = 2$, so the local field theory
rule $N_f = 2N_d$ does not hold. To understand where the constraints on $\psi$ come from, observe in
(3.2.2) that the information of the initial data of $\psi$ amounts to the information of the initial
data of the Green’s function in $\Box^{-1}$ and therefore to the choice of inversion $\Box^{-1}$. Thus, these
additional data that suddenly pop up were actually there all along. They were determining
the choice of $\Box^{-1}$ we were using, while now they are expressed as initial conditions of some
auxiliary field.

Another way to understand this is by noting that if we do consider an arbitrary $\psi^{\rm hom}$, the
effect is that the source is shifted

$$ J \to J + m^2 \psi^{\rm hom} , \tag{3.2.6} $$

as we already saw when we were integrating-out the Stückelbergs in section 2.7. Since adding
a homogeneous part can be interpreted as changing the Green’s function in $\Box^{-1}$, considering a
$\psi^{\rm hom} \neq 0$ can be interpreted as a different choice of $\Box^{-1}$ 8.

Whatever the way we choose to see this, the conclusion is that different initial conditions
of $\psi$ correspond to different choices of $\Box^{-1}$ in the original non-local theory and thus different
original theories. This implies that the initial data of $\psi$ are theory-level data, in contrast with
the initial conditions of regular dynamical fields which represent different solutions of the same
theory. Thus, the unconstrained theory of $\phi$ and $\psi$ represents many more theories than (3.2.1),
one for every choice of $\psi^{\rm hom}$.
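
The source shift can be checked in one line. The following is a short worked check, assuming only the local equation $\Box\phi = m^2\psi + J$ of (3.2.3) and the general solution $\psi = \psi^{\rm hom} + m^2\Box^{-1}\phi$ of (3.2.5), with the same $\Box^{-1}$ as in (3.2.1):

```latex
\begin{align}
\Box\phi &= m^2\psi + J
          = m^2\left(\psi^{\rm hom} + m^2\,\Box^{-1}\phi\right) + J \\
\Longrightarrow\quad
\Box\phi - m^4\,\Box^{-1}\phi &= J + m^2\,\psi^{\rm hom},
\qquad \Box\,\psi^{\rm hom} = 0 .
\end{align}
```

Setting $\psi^{\rm hom} = 0$ gives back (3.2.1). Any other choice keeps the same operator acting on $\phi$ but shifts the source by a homogeneous solution, which is the sense in which the initial data of $\psi$ label different non-local theories rather than different solutions of one theory.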


Local action and diagonalization 

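We can now pass to the action corresponding to these equations

$$ S = \int d^D x \left[ \frac{1}{2} \phi \Box \phi + \frac{1}{2} \psi \Box \psi - m^2 \phi \psi - \phi J \right] , \tag{3.2.7} $$

which could have also been obtained by integrating-in $\psi$ directly in the formal action of (3.2.2)

$$ S = \int d^D x \left[ \frac{1}{2} \phi \Box \phi - \frac{m^4}{2} \phi \Box^{-1} \phi - \phi J \right] . \tag{3.2.8} $$

Now one can diagonalize (3.2.7) to get

$$ S = \int d^D x \left[ \frac{1}{2} \phi_+ \left( \Box - m^2 \right) \phi_+ + \frac{1}{2} \phi_- \left( \Box + m^2 \right) \phi_- - \frac{1}{\sqrt{2}} \left( \phi_+ + \phi_- \right) J \right] , \tag{3.2.9} $$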


where $\phi_\pm = (\phi \pm \psi)/\sqrt{2}$, so $\phi_-$ is a tachyon. This could have been directly deduced by looking
at the propagator of the non-local theory (3.2.1) or (3.2.8)

$$ D(k) = \frac{-i}{k^2 - \frac{m^4}{k^2}} = -\frac{i}{2} \left[ \frac{1}{k^2 + m^2} + \frac{1}{k^2 - m^2} \right] , \tag{3.2.10} $$

which indeed reflects the spectrum of the localized theory. The constraint on the initial conditions
of $\psi$ translates into equal initial conditions for $\phi_+$ and $\phi_-$. In particular, if $m \to 0$ then
this gives $\phi_+ = \phi_-$ at all times, since they obey the same equation. This is consistent with the
fact that if $m \to 0$ then $\psi \to 0$.

8 More precisely, since by construction $\psi \to 0$ if $\phi \to 0$, we would have that $\psi^{\rm hom}$ is a linear functional of $\phi$
and the new $\Box^{-1}$ can thus still be written as the convolution with a Green’s function.

As already mentioned, by this “localization” procedure we obtain a bijective map between 
the solutions of the non-local equation (3.2.2) and the solutions of a trivial local field theory, as 
long as we carefully take into account the constraints on the initial conditions. The dynamical 
content is therefore clearly a healthy scalar field and a tachyonic one. Thus, non-local field 
theories “hide” constrained dynamical fields. 
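
The pole structure of the non-local operator can be checked directly. The sketch below assumes the mostly-plus signature, so that $\Box \to -k^2$ in Fourier space, and drops overall factors of $i$ and the $\epsilon$ prescription, which are convention-dependent:

```latex
\begin{align}
\left(\Box - m^4\,\Box^{-1}\right)\phi = J
\quad&\longrightarrow\quad
-\left(k^2 - \frac{m^4}{k^2}\right)\tilde\phi(k) = \tilde J(k), \\
\frac{1}{k^2 - m^4/k^2} = \frac{k^2}{k^4 - m^4}
&= \frac{1}{2}\left[\frac{1}{k^2 + m^2} + \frac{1}{k^2 - m^2}\right].
\end{align}
```

The two poles have equal residues of the same sign, so neither mode is a ghost: the pole at $k^2 = -m^2$ is a healthy massive scalar, while the pole at $k^2 = +m^2$ is the tachyon. These match the operators $\Box - m^2$ and $\Box + m^2$ of the diagonalized action.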

Localization versus gauge theory constraints 

It is now very important to understand that this kind of constraint on the initial conditions 
has nothing to do with the constraints that arise in local gauge theories. Indeed, one of the 
reasons for spending so much time analyzing linear local gauge theories was to clearly see how 
one obtains $N_f = 2N_d$, i.e. how the constrained fields are necessarily non-dynamical and vice-versa.
As we have seen in more than one way, the constraints of gauge theory are encoded
within the action, i.e. the latter is all we need to deduce them. This is most obvious in the
canonical formalism, where half of the constraints are the equations of motion of components
that are Lagrange multipliers, while the other half can be imposed thanks to the arbitrariness of
these Lagrange multipliers in the rest of the equations of motion. It is thus the structure of the
action itself, which is ultimately due to the presence of the gauge symmetry, that constrains
the initial conditions of some fields and automatically makes them non-dynamical. Here on the
other hand the constraints on $\psi$ are not the consequence of some equation of motion, symmetry,
or any other particular structure. They are constraints that simply follow from the definition
of $\psi$ as a shortcut notation for a fixed functional of $\phi$ and must be appended to the action 9.
It is therefore important not to confuse constraints that are due to some gauge symmetry of 
the theory, and constraints that are due to localization, especially when we deal with non-local 
gauge theories. 

3.2.1 Quantization 

Now that we have found a way of reformulating a non-local theory in terms of a local, but 
constrained, theory, we can address the issue that was raised in section 3.1.3, namely, of whether 
one gets a sensible QFT by simply plugging a genuinely non-local action inside a path integral 
without asking any further questions. We see that the problem of non-locality, which kept us 
from defining a canonical quantization, has now translated into the problem of implementing, 
somehow, the constraints of the auxiliary fields at the quantum level. 

So let us simply consider a local action with constrained boundary data à la Feynman, since
we work in an “in-out” framework and thus compute $\langle 0_{\rm out} | T \dots | 0_{\rm in} \rangle$. In general the constraints
will not concern specific fields in the diagonalized action, but rather linear combinations of
their boundary data. Translating these into the constraints on the creation operators and thus
on the particles, they will generally amount to projections on some Hilbert subspace. Thus,
constraining this external particle information corresponds to considering only a sub-block of
the S-matrix, i.e. not all the possible “in” and “out” states. In the simplest case where the
constraints impose Feynman boundary conditions on a single field, this translates into zero
corresponding particles on external legs. However, since the field is dynamical its propagator
will appear in the internal lines. Let us call the corresponding particles “auxiliary”.

9 As we saw, this is nothing but the information of the “retardedness” of the Green’s function, which was also
appended to our formal action.

Now, if the S-matrix is in block-diagonal form and the constraints correspond to choosing
one of these blocks, then the evolution will be unitary. Starting with no auxiliary particles 
in the initial state, no such particles are produced in the final state and thus probability is 
conserved in this subspace. This is precisely what happens in non-abelian gauge theories where 
one introduces the Faddeev-Popov particles in order to guarantee that if we start with no 
longitudinally polarized gauge bosons these will not be produced in the final state. However, 
in that case, it is the gauge-symmetric structure of the theory, ultimately leading to the BRST 
symmetry, which implies this highly non-trivial result [95]. Here there is no such structure 
for the auxiliary localizing fields 10, so the S-matrix will generally not be in block-diagonal
form. Thus, the auxiliary particles will be produced in the “out” state and not taking into 
account these states will mean that the evolution is not unitary. Put differently, part of the 
probability will “leak” in final states that are not part of the physical Hilbert subspace. For 
more complicated constraints on the initial and final states, analogous unitarity problems will 
necessarily occur. 

One possibility for avoiding this conclusion could be that the auxiliary particles are much
heavier than the energies in which we are interested, so that they cannot be produced in the
final states and evolution is unitary. Indeed, this is what happens in effective field theories,
where some heavy field has been integrated-out

$$ e^{i S_{\rm eff}[\phi]} \sim \int \mathcal{D}\Phi\, e^{i S[\phi, \Phi]} , \tag{3.2.11} $$

with $S_{\rm eff}$ providing a unitary evolution in the subspace of zero $\Phi$ particles at low energies.
Unfortunately however, in this case one usually has that the non-local operator is of the form
$(\Box - m^2)^{-1}$, since the integrated-out mode is massive. By definition then, the effective theory
is valid ($S_{\rm eff}$ is unitary) only up to the cut-off $\Lambda < m$. Then, for such scales $p, E < \Lambda$ we can
expand

$$ \left( \Box - m^2 \right)^{-1} = -\frac{1}{m^2} \left( 1 + \frac{\Box}{m^2} + \dots \right) , \tag{3.2.12} $$

so that the effective theory cannot be non-local.

We can thus conclude that, if we take the localized theory as the “fundamental one” and try 
to quantize it, then we have to consider all the dynamical fields on equal footing. There is no 
way in which the constraints that we impose classically may be somehow implemented in the 
quantum context without spoiling unitarity. Then, considering the classical limit of this QFT 
will result in the unconstrained localized equations of motion, thus representing more solutions 
than the ones of the original non-local theory. In conclusion, it makes no sense quantizing 
a non-local action. This is why the non-local models proposed in the literature are usually 
interpreted as the quantum effective action $\Gamma$ of some underlying local fundamental action $S$,
or as any other type of classical effective action. 

Finally, we can now answer the question raised in section 3.1.3, of whether one could
simply plug a genuinely non-local action inside a path integral and start computing scattering
amplitudes. We argued that in the case of massive electrodynamics this was justified because
it simply amounts to integrating-out a field in a local theory. Here we see that in general, the
would-be integrated-out fields, i.e. the localizing auxiliary fields, must be deconstrained in any
quantization scheme that preserves unitarity. Thus, the quantum theory will not have the non-local
theory as its classical limit, but a larger theory. The case of massive electrodynamics, or
of Fierz-Pauli theory, is special, in that the integrated-out fields are pure-gauge (Stückelbergs)
and thus do not correspond to particles in the local theory anyway.

10 The only exception are precisely the non-local formulations of local theories since then the localizing fields
are the Stückelbergs that are pure-gauge.

The bottom line here is that all non-local models should be understood as classical theories,
that are therefore entirely determined by their equations of motion. This is going to be
understood in the rest of the thesis. 


3.3 Constrained dynamical fields and classical stability 

A question of prime importance is whether a constrained dynamical field may destabilize a 
solution of interest. Indeed, in the literature, this special status has been invoked in order to 
minimize the impact of constrained dynamical ghosts on classical stability [96,97]. As we will 
now show, the impact on stability of such modes is the same as the one of ordinary dynamical 
fields. Nevertheless, note that, in contrast with the quantum context where a ghost is a fatal 
flaw 11 , at the classical level a ghost does not necessarily imply an instability. Indeed, the 
stability verdict is not obvious in the presence of non-linear effects, as we will see in concrete 
examples, so each case must be analyzed individually. 

Classical ghost impact 

Loosely speaking, a solution is “stable”, or at least “metastable”, if arbitrarily small perturbations
of its initial conditions yield solutions that are close enough to the original one 12. Thus,
if some field is dynamical but not a degree of freedom, then its initial conditions cannot be 
perturbed and this may affect the stability verdict. Indeed, if the unstable modes obey an 
unsourced linear equation, then constraining their initial conditions to zero implies that they 
vanish at all times and the trivial solution is stable. One could still get away with non-trivial 
initial data giving diverging solutions since, by linearity, the auxiliary field does not interact 
with the physically observable ones and thus observable quantities remain bounded. 

However, this is unfortunately not at all a realistic example, for all physically relevant
theories contain (self-)interactions. In that case, the information of initial conditions becomes
irrelevant. Indeed, consider the simplest example where the constrained unstable field has a
linear source with compact support in time. As the source is turned on the field responds
by taking a non-zero value, and thus when the source is turned off the field evolves as if
it had started with non-trivial initial conditions. Moreover, the instability is communicated
to the rest of the fields through the interactions, leading to diverging physical observables.
Therefore, in the presence of (self-)interactions, there is no difference between constrained and
unconstrained dynamical fields: any dynamical field matters in the classical stability analysis.
It is not important whether some field has incoming waves at past infinity or not, since these will
be generated anyway at future infinity by its interactions. In the example given above, for
instance, we have that the tachyonic mode $\phi_-$ makes the $\phi = 0$ solution of the non-local theory
unstable. Of course, this would have been the case even if it were not sourced, because the
initial conditions are not $\phi_-(t_i) = \dot{\phi}_-(t_i) = 0$, but this example shows how the diagonalization
makes the constrained dynamical modes interact with the source as well 13.

11 Indeed, in the quantum theory, a ghost gives rise to a negative-energy state, and therefore the vacuum
can decay into ghosts plus ordinary (positive-energy) particles, as long as the total energy remains zero. The
corresponding decay rate is infinite because the kinematic integral is unbounded, so this instability is fatal. More
precisely, putting a cut-off $\Lambda$ on momenta we get, by dimensional analysis, that the decay probability per unit time
and unit volume is $\Gamma \sim \Lambda^D$. This actually holds for ghosts with tachyonic mass, so that the corresponding field
oscillates and there is a notion of particle, although with negative-definite energy $E = -\sqrt{p^2 + m^2}$. For ghosts
with non-tachyonic mass part of the modes are diverging instead of oscillating so in that case one cannot even
define particles.

12 The notions of “small” and “close enough” are of course subjective since they depend on the choice of a
distance in field space and can be taken from either an absolute or a relative point of view.
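
The role of the source can be illustrated numerically. The following is a minimal sketch, not taken from the thesis, assuming the spatially homogeneous limit of the tachyonic mode equation $(\Box + m^2)\phi_- = J/\sqrt{2}$ with $\Box = -\partial_t^2$, i.e. $\ddot\phi_- = m^2\phi_- - J/\sqrt{2}$, together with vanishing initial data. The top-hat source, the function names, the integrator and all numerical values are hypothetical choices made for illustration only.

```python
import math


def source(t, t_on=0.0, t_off=1.0, amplitude=1.0):
    """Hypothetical top-hat source J(t) with compact support in time."""
    return amplitude if t_on <= t < t_off else 0.0


def evolve_tachyon(m=1.0, t_end=10.0, dt=1e-3):
    """Integrate phi'' = m^2 phi - J(t)/sqrt(2) with phi(0) = phi'(0) = 0.

    Uses a semi-implicit Euler step (velocity first, then field).
    Returns the lists of times and field values.
    """
    n_steps = int(round(t_end / dt))
    phi, pi = 0.0, 0.0  # vanishing initial data: field and its time-derivative
    times, values = [0.0], [0.0]
    for n in range(n_steps):
        t = n * dt
        acc = m * m * phi - source(t) / math.sqrt(2.0)
        pi += dt * acc   # update the velocity with the current force
        phi += dt * pi   # update the field with the new velocity
        times.append(t + dt)
        values.append(phi)
    return times, values


times, values = evolve_tachyon()
i_off = 1000  # index of t = 1, when the assumed source switches off
print("field when the source switches off:", values[i_off])
print("field at the end of the run:      ", values[-1])
```

In this sketch the field is exactly zero until the source acts, is displaced while the source is on, and keeps growing exponentially after the source is switched off, as if it had started with non-trivial initial conditions. Only the homogeneous mode is followed here; in the full equation the modes with $|\vec k| < m$ behave in the same way.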

Comparing with other works 

The above argument allows us to understand some weaknesses in the argumentation of [96] 
and [97], which erroneously conclude that the constraints on the initial/boundary conditions 
of ghosts neutralize their destabilizing power. Let us consider each case separately. 

In their pioneering work on non-local modifications of GR for cosmological purposes, Deser
and Woodard proposed the following simple formal action [65]

$$ S_{\rm DW} = \frac{1}{16\pi G} \int d^D x \sqrt{-g}\, R \left[ 1 + f\left( \Box^{-1} R \right) \right] . \tag{3.3.1} $$

In [97], where they analyze its stability, they note that, when localized, the theory has a
dynamical ghost when $f$ is non-linear [98]. As they correctly show, working with the non-local
equations, this mode is not a degree of freedom since its initial conditions are fixed. More
precisely, this is a phenomenological model in which the $\Box^{-1}$ that appears in the equations of
motion starts its convolution at some finite $t_i$. Thus, the non-local equations of motion become
local at $t = t_i$ 14, and, being a gauge theory, some of them will constrain the initial data. As
in the linear cases that we have studied, these are nothing but the equations of motion of the
time-components that are first-order in time-derivatives. In [97] it is indeed found that the
modification does not change this property, so that there are as many constraints on the initial
data as in GR for the same field content $g_{\mu\nu}$. Thus, the degrees of freedom are the same as in
GR 15. From this however the authors infer that the dangerous mode is saved from propagating,
because of the gauge structure of GR, and thus that it cannot affect classical stability.

This statement reveals precisely the confusion that might arise in non-local gauge theory
which we discussed in section 3.2, i.e. that one considers all modes whose initial data are
constrained as non-dynamical ones. The constraint we have on the auxiliary scalar here is not
a gauge-theory constraint which would automatically make it non-dynamical. Rather, it is a
constraint that comes from the fixed choice of inversion $\Box^{-1}$ and thus does not neutralize that
mode. Again, counting degrees of freedom is not equivalent to counting dynamical fields 16 in
non-local field theory. By going to the localized formulation the situation becomes clear. The
gauge constraints reduce $g_{\mu\nu}$ to the two dynamical fields of a massless graviton (just as in GR),
while the localizing scalars have constrained initial conditions but remain dynamical. We thus
have an interacting dynamical ghost that can potentially destabilize the solution of interest.

13 This can be expected whenever the hidden dynamical field has a corresponding pole in the saturated propagator
of the non-local theory.

14 That is, since the non-localities take the form $\int_{t_i}^{t} \dots$, they all vanish at $t = t_i$.

15 In the localized formulation this would have been deduced by simply noting that the initial conditions of
the auxiliary scalars vanish at $t_i = 0$. We will see later on a concrete example of this using a similar model.

16 According to the definitions of section 2.1.2.

In [96] the proposed model is rather

$$ S_{\rm B} = \frac{1}{16\pi G} \int d^D x \sqrt{-g} \left[ R - \alpha R^{\mu\nu} L^{-1} G_{\mu\nu} \right] , \tag{3.3.2} $$

where $L = \Box + O(R)$. Localizing this action one finds again dynamical ghosts, and it is argued
that they do not influence stability because of their fixed boundary data. The author even 
illustrates this argument with the following example. Consider the simplest local theory and 
turn it into a non-local one artificially as follows 


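$$ S = \int d^D x \left[ \frac{1}{2} \phi \Box \phi - \phi J \right] = \int d^D x \left[ \frac{1}{2} \left( \Box \phi \right) \Box^{-1} \left( \Box \phi \right) - \phi J \right] . \tag{3.3.3} $$

Then localize by integrating-in another scalar

$$ S = \int d^D x \left[ -\frac{1}{2} \psi \Box \psi + \psi \Box \phi - \phi J \right] , \qquad \psi = \Box^{-1} \Box \phi = \phi , \tag{3.3.4} $$

and diagonalize $\psi = \psi' + \phi$

$$ S = \int d^D x \left[ \frac{1}{2} \phi \Box \phi - \frac{1}{2} \psi' \Box \psi' - \phi J \right] . \tag{3.3.5} $$

Of course, this ghost is only an artefact of this procedure. Indeed, its equation of motion is
$\Box \psi' = 0$ and, for zero initial conditions, we have $\psi' = 0$, so integrating it out gives back the
original local theory 17. With this example however, the author implies that this apparent ghost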
is of the same kind that arises in the localization of (3.3.2), and thus that the latter must also 
be harmless. This is not true because the above example precisely avoids that the ghost couples 
to the source. In contrast, in the localization of (3.3.2), after diagonalization, the ghost mode 
does couple to the source. 
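
The decoupling of the artificial ghost can be verified explicitly. The following short check assumes that (3.3.4) reads $S = \int d^D x \left[ -\frac{1}{2}\psi\Box\psi + \psi\Box\phi - \phi J \right]$ and uses $\simeq$ for equality up to total derivatives; it is a sketch of the algebra, not a quotation of [96].

```latex
\begin{align}
-\tfrac{1}{2}\,\psi\Box\psi + \psi\Box\phi
&= -\tfrac{1}{2}\,(\psi' + \phi)\,\Box\,(\psi' + \phi) + (\psi' + \phi)\,\Box\phi \\
&\simeq -\tfrac{1}{2}\,\psi'\Box\psi' - \psi'\Box\phi - \tfrac{1}{2}\,\phi\Box\phi
        + \psi'\Box\phi + \phi\Box\phi \\
&= \tfrac{1}{2}\,\phi\Box\phi - \tfrac{1}{2}\,\psi'\Box\psi' .
\end{align}
```

The source term $-\phi J$ is untouched by the shift, so $\psi'$ carries the wrong-sign kinetic term but obeys $\Box\psi' = 0$ with no coupling to $J$. In (3.2.9), by contrast, both $\phi_+$ and $\phi_-$ couple to $J$ with weight $1/\sqrt{2}$, which is the structural difference the argument relies on.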

A probable source of confusion is the fact that the author works in Euclidean space, in which 
case $\Box^{-1}$ is uniquely defined on fields that vanish sufficiently fast at infinity and thus these are
the natural constraints for the localizing fields. The fact that these are boundary constraints, 
instead of initial condition constraints, implies that whatever modulations the constrained 
field might experience in the bulk, its asymptotic values are zero. However, Wick rotating 
to Lorentz space-time we get that these boundary conditions turn into Feynman boundary 
conditions 18 , which is not the type of constraints one must impose for causal physics. Rather, 
using the retarded propagators the constraints apply on the initial conditions, so there is no 
control on the behaviour of the ghost at future infinity. As we have argued, in the presence of 
non-linearities, this mode will be generically activated. 

17 Note that this holds also on non-trivial space-times. 

18 Indeed, the trends $\sim e^{\omega t_E}$ at $t_E \to -\infty$ and $\sim e^{-\omega t_E}$ at $t_E \to +\infty$, with $\omega > 0$, for the boundary conditions
in Euclidean time turn into $\sim e^{i\omega t}$ at $t \to -\infty$ and $\sim e^{-i\omega t}$ at $t \to +\infty$ in Lorentzian time, i.e. no ingoing
positive-frequency waves and no outgoing negative-frequency waves.

Small summary


The take-away message here is that the intuitive property $N_f = 2N_d$ of local field theory has
to be abandoned in the non-local case. There are hidden dynamical fields that appear only
after all boxes have been put in the numerator and thus $N_f < 2N_d$. The fact that their initial
conditions are constrained is a consequence of the definite choice of Green’s function in the non-local
theory. One has to be even more careful in non-local gauge theories where there are two
types of constraints that should not be confused: the ones due to the gauge symmetry, which
neutralize modes, and the ones due to the localization, which do not affect propagation. From
the above paragraphs it is now clear that what matters for realistic physics are the dynamical
fields rather than the ones with unconstrained initial conditions. Constrained ghosts and
tachyons are thus as dangerous as their unconstrained cousins. We must stress however once
more that, because these theories are classical, the presence of ghosts or tachyons does
not necessarily imply an instability, as non-linearities can affect their evolution non-negligibly.
Therefore, in the presence of such modes a case-by-case classical stability analysis is required 
to settle the issue.


Chapter 4

Non-local gravity 


We are now ready to consider generally-covariant extensions of the non-local field theories 
introduced in the second chapter. This chapter is based on, and extends, [68,70,71].

Manipulating $\Box^{-1}$ on curved space-time

Now $\Box^{-1}$ is a right-inverse of $\Box = \nabla_\mu \nabla^\mu$ and therefore depends on the metric field $g_{\mu\nu}$. For
the reader who is interested in the mathematical details of this operator on curved space-time
we suggest a first look at appendix A. An important property is that now $\Box^{-1}$ mixes the
indices of the tensor on which it acts, just like $\Box$ does. It also commutes with the metric, in
the sense that

$$ \Box^{-1} X_\mu = g_{\mu\nu} \Box^{-1} X^\nu , \tag{4.0.1} $$

but of course the $\Box^{-1}$ operators on each side of the equation are different since they act on different
spaces. Moreover, note that there is more than one operator which reduces to $\Box^{-1}$ on flat
space-time. For example, we have $(\Box - \xi R)^{-1}$ when acting on scalars, $(\delta_\mu^\nu \Box - \xi_1 \delta_\mu^\nu R - \xi_2 R_\mu^{\ \nu})^{-1}$
when acting on vectors and so on. We will use the notation “$\Box^{-1}$” for the as yet undetermined
generalizations of the flat-space $\Box^{-1}$.

For the retarded Green’s function of □ to be well-defined we need space-time to be globally 
hyperbolic, so that there exists a global time function which foliates the manifold, notions of 
past and future infinity, and of course causality. We will therefore assume that this is the case 
in what follows, even though the metric is a dynamical field, i.e. a field on which we have a
priori no control. As it turns out, for the solutions that will interest us in this thesis, the couple
$(\mathcal{M}, g)$ will indeed be globally hyperbolic for the time-intervals of interest.

4.1 Constructing generally-covariant equations of motion 

We wish to generalize the models constructed in section 2.7.3 to generally-covariant theories of
$g_{\mu\nu}$. Simply generalizing (2.7.62) to an arbitrary background would correspond to the theory
of a linear spin-2 field on curved space-time, which is not what we want. Moreover, working
with $h_{\mu\nu}$ is not a good idea because the latter now corresponds to the perturbation around
some background metric, $h_{\mu\nu} = g_{\mu\nu} - \bar{g}_{\mu\nu}$. Not only would this make our equations depend on
$\bar{g}_{\mu\nu}$, but it would also make general covariance hard to implement.

The obvious solution is to consider non-local combinations of curvature invariants of $g_{\mu\nu}$
and match these to (2.7.62) in the linearized limit over Minkowski space-time. In doing so
however the resulting equations are not transverse (under $\nabla$) in general. For example, say we
have a term of the form

$$ \Box^{-1} G_{\mu\nu} \tag{4.1.1} $$

in our equation. Perturbing around flat space-time to linear order, since $[\partial_\mu, \Box^{-1}] = 0$, we have
that this tensor is transverse because $G_{\mu\nu}$ is. On curved space-time however, this is no longer
true because $[\nabla_\mu, \Box^{-1}] \neq 0$.

The absence of transversality is inconsistent with gauge-invariance. Indeed, the latter implies
that some of the components of the field are not determined by the equations of motion,
and thus translates into having fewer equations of motion than the number of field components.
This is the case if the equations are identically transverse, since we have $D$ fewer equations
corresponding to the $D$ gauge parameters of the diffeomorphism symmetry. If the equations are
not identically transverse, but we do have the gauge symmetry, then the fields that are not 
pure-gauge are overdetermined. To resolve this problem, one has two options. 

4.1.1 Projector-based models 

In the previous chapter we have identified the operators $\mathcal{P}$ (2.7.32) that make a tensor
transverse. We could thus use these operators here to make the generalized equations transverse by
hand, without affecting the linearized limit (if we choose ${}_1\mathcal{P}$). This option has been considered
for instance in [68,73,105] and we will refer to such models as “projector-based models”.

No closed form 

On flat space-time we were able to construct explicit expressions for the transverse operators
$\mathcal{P}$. Unfortunately, on arbitrary space-times, these operators exist but admit no closed form
in general. This is because now the order of the differential operators matters since covariant
derivatives do not commute and in particular $[\nabla_\mu, \Box_r^{-1}] \neq 0$. As shown in appendix A.3.3,
already for an Einstein space $R_{\mu\nu} = \kappa g_{\mu\nu}$, where $\kappa$ is a constant, we have

$\nabla_\mu \Box_r^{-1} = (\Box - \kappa)_r^{-1} \nabla_\mu$ . (4.1.2)

The only case where this is not a problem is for vectors, where one can simply covariantize the
original expression (2.7.4)

$\mathcal{P}_\mu{}^\nu \equiv \delta_\mu^\nu - \nabla_\mu \Box_r^{-1} \nabla^\nu$ , (4.1.3)

which only makes sense on vectors whose covariant divergence has finite past. Indeed, all the
properties of this operator, listed below eq. (2.7.4), are still valid and their demonstrations go exactly
the same since we did not use $[\partial_\mu, \partial_\nu] = 0$ nor $[\partial_\mu, \Box_r^{-1}] = 0$ to derive them. It is therefore a
projector on the transverse subspace

$\nabla^\mu A^T_\mu = 0$ , $A^T_\mu \equiv \mathcal{P}_\mu{}^\nu A_\nu$ , (4.1.4)

and is invariant under U(1) gauge transformations whose parameter has finite past 1 . The only
difference with the flat space-time case is that now $\mathcal{P}_{\mu\nu}$ is not symmetric because $[\nabla_\mu, \Box_r^{-1}] \neq 0$.

1 Indeed, as shown in appendix A.3.2, the property $[\Box, \Box_r^{-1}] = 0$ for fields with finite past still holds for
globally hyperbolic space-times.
To understand the obstruction in constructing closed forms for transverse operators $\mathcal{P}$ of
higher rank on generic space-times, let us first see how one could proceed for the vector case.
We can start by defining the action of $\mathcal{P}$ through an auxiliary field $A$

$A^T_\mu = A_\mu - \nabla_\mu A$ , (4.1.5)

obeying

$\Box A = \nabla^\mu A_\mu$ . (4.1.6)

Then, solving for $A$ using the retarded $\Box_r^{-1}$ one retrieves the definition of $A^T_\mu$. Note that
this looks very much like the localization procedure since the initial conditions of $A$ are fixed
to zero at past-infinity by the use of $\Box_r^{-1}$. This is not a surprise, since a transverse operator is
necessarily non-local and the above procedure amounts to localizing it by integrating in $A$.
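
As a quick check of the auxiliary-field construction, the two properties of $A^T_\mu$ follow in one line each. The sketch below assumes the reading $A^T_\mu = A_\mu - \nabla_\mu A$ with $\Box A = \nabla^\mu A_\mu$ for (4.1.5) and (4.1.6), and a U(1) transformation written as $A_\mu \to A_\mu + \nabla_\mu \theta$ (the sign convention is an assumption); $\theta$ is taken to have finite past so that the retarded inverse introduces no homogeneous solution.

```latex
\begin{align}
\nabla^\mu A^T_\mu &= \nabla^\mu A_\mu - \Box A = 0 , \\
A_\mu \to A_\mu + \nabla_\mu \theta
  \quad &\Rightarrow \quad
  \Box\, \delta A = \Box \theta
  \quad \Rightarrow \quad
  \delta A = \theta
  \quad \Rightarrow \quad
  \delta A^T_\mu = \nabla_\mu \theta - \nabla_\mu \theta = 0 .
\end{align}
```

No commutator of covariant derivatives is needed at any step, which is why the vector case goes through on arbitrary space-times.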

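Now let us try the above construction for $\mathcal{P}$ in the case of symmetric two-tensors. We can
again define

$h^T_{\mu\nu} = h_{\mu\nu} - \nabla_{(\mu} h_{\nu)}$ , (4.1.7)

where the $D$ components of $h_\mu$ obey the $D$ equations

$\Box h_\mu + \nabla^\nu \nabla_\mu h_\nu = 2 \nabla^\nu h_{\nu\mu}$ , (4.1.8)

or alternatively

$\Box h_\mu + \nabla_\mu \nabla^\nu h_\nu + R_{\mu\nu} h^\nu = 2 \nabla^\nu h_{\nu\mu}$ . (4.1.9)

To solve for $h_\mu$ one must first solve for $\nabla^\mu h_\mu$ which, on flat space-time, would be achieved by
taking the double-divergence of (4.1.7). Doing this on arbitrary space-time and rearranging
the covariant derivatives in a convenient way we get

$\Box \nabla^\mu h_\mu + R^{\mu\nu} \nabla_\mu h_\nu + \frac{1}{2} h^\mu \nabla_\mu R = \nabla^\mu \nabla^\nu h_{\mu\nu}$ . (4.1.10)

We now see that $\nabla^\mu h_\mu$ cannot be expressed in terms of $h_{\mu\nu}$ on arbitrary space-times, hence the
obstruction for the construction of a closed form for $\mathcal{P}$. Rather, it seems that one can proceed
only in the case of an Einstein space-time $R_{\mu\nu} = \kappa g_{\mu\nu}$, with $\kappa$ a constant

$\nabla^\mu h_\mu = (\Box + \kappa)_r^{-1} \nabla^\mu \nabla^\nu h_{\mu\nu}$ . (4.1.11)

Plugging this back inside (4.1.9) allows us to express $h_\mu$ in terms of $h_{\mu\nu}$,

$h_\mu = 2 (\Box + \kappa)_r^{-1} \nabla^\nu h_{\nu\mu} - (\Box + \kappa)_r^{-1} \nabla_\mu (\Box + \kappa)_r^{-1} \nabla^\nu \nabla^\rho h_{\nu\rho}$ , (4.1.12)

so plugging this result inside (4.1.7) finally gives

$h^T_{\mu\nu} = h_{\mu\nu} - 2 \nabla_{(\mu|} (\Box + \kappa)_r^{-1} \nabla^\rho h_{\rho|\nu)}
 + \nabla_{(\mu} (\Box + \kappa)_r^{-1} \nabla_{\nu)} (\Box + \kappa)_r^{-1} \nabla^\rho \nabla^\sigma h_{\rho\sigma} \equiv {}_1\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma}$ . (4.1.13)

Indeed, specializing to flat space-time, one can then recognize the action of ${}_1\mathcal{P}$ as defined in
(2.7.32). Note that under a gauge transformation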

, (4.1.14) 
by the defining equations (4.1.7) and (4.1.8), we have that $\delta h_\mu$ is proportional to the gauge parameter, so that $h^T_{\mu\nu}$ is invariant.

Thus, as in the case of flat space-time, $h^T_{\mu\nu}$ is both transverse and gauge-invariant.

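One can then generalize the whole one-parameter family of transverse operators (2.7.32).
The transverse-traceless projector can be constructed analogously by defining

$h^{TT}_{\mu\nu} \equiv \tilde{h}_{\mu\nu} - \nabla_{(\mu} h_{\nu)} + \frac{1}{D} g_{\mu\nu} \nabla^\rho h_\rho$ , $\tilde{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{D} g_{\mu\nu} h$ , (4.1.15)

and

$\Box h_\mu + \frac{D-2}{D} \nabla_\mu \nabla_\nu h^\nu + R_{\mu\nu} h^\nu = 2 \nabla^\nu \tilde{h}_{\nu\mu}$ . (4.1.16)

Solving for $h_\mu$ on an Einstein space-time one then gets

$h^{TT}_{\mu\nu} = \tilde{h}_{\mu\nu} - 2 \nabla_{(\mu|} (\Box + \kappa)_r^{-1} \nabla^\rho \tilde{h}_{\rho|\nu)} + \frac{1}{D-1} g_{\mu\nu} \left( \Box + \frac{D}{D-1} \kappa \right)_r^{-1} \nabla^\rho \nabla^\sigma \tilde{h}_{\rho\sigma}
 + \frac{D-2}{D-1} \nabla_{(\mu} (\Box + \kappa)_r^{-1} \nabla_{\nu)} \left( \Box + \frac{D}{D-1} \kappa \right)_r^{-1} \nabla^\rho \nabla^\sigma \tilde{h}_{\rho\sigma} \equiv {}_0\mathcal{P}_{\mu\nu}{}^{\rho\sigma} h_{\rho\sigma}$ , (4.1.17)

which reduces to the action of ${}_0\mathcal{P}$ on flat space-time (2.7.32). Again, $h^{TT}_{\mu\nu}$ is invariant under
both (4.1.14) and $\delta h_{\mu\nu} = -g_{\mu\nu} \theta$, which is the generalization of (2.7.35). Finally, note that
${}_1\mathcal{P}$ and ${}_0\mathcal{P}$ are $\mathbb{R}$-linear operators even when they cannot be described in closed form, as is
easy to check using their definitions involving the auxiliary fields. Thus, the projector on the
transverse-pure-trace part can be defined using (2.7.33)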

, (4.1.18) 

and the generalization of ${}_a\mathcal{P} \cdot h$ is

o,hJ lv = hJJ + ahl . (4.1.19) 


Origin of the obstruction 

The origin of this limitation to Einstein space-times can be traced back to the “pathology”
of linear higher spin theories [99] of not being able to preserve their gauge symmetries on
backgrounds that are not Einstein [100-103] 2 . Indeed, in the vector case $s = 1$, the Maxwell
action generalizes straightforwardly to arbitrary background

$S = \int d^D x \sqrt{-g} \left[ -\frac{1}{4} g^{\mu\nu} g^{\rho\sigma} F_{\mu\rho} F_{\nu\sigma} + A_\mu j^\mu \right]$ , (4.1.20)

which is still U(1)-symmetric, and the equations of motion are thus covariantly transverse (for
a covariantly conserved source)

$\nabla^\nu F_{\nu\mu} = -j_\mu$ , (4.1.21)

since

$\nabla^\mu \nabla^\nu F_{\mu\nu} = \nabla^{[\mu} \nabla^{\nu]} F_{\mu\nu} = R^{\mu\nu} F_{\mu\nu} = 0$ . (4.1.22)

2 Simply put, unlike in the case of differential forms, the presence of symmetric pairs of indices when $s \geq 2$
forces the use of $\nabla$ in the action. This in turn implies that the gauge symmetry also depends on $\nabla$ and can
therefore not be achieved on arbitrary space-times.



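This implies that they can be written as a differential operator composed with the transverse
projector $\mathcal{P}_\mu{}^\nu$ acting on $A_\nu$. Indeed

$\Box A^T_\mu - R_\mu{}^\nu A^T_\nu$
  $= \Box A_\mu - \Box \nabla_\mu \Box_r^{-1} \nabla_\nu A^\nu - R_{\mu\nu} A^\nu + R_\mu{}^\nu \nabla_\nu \Box_r^{-1} \nabla_\rho A^\rho$
  $= \Box A_\mu - \nabla_\mu \nabla_\nu A^\nu - [\Box, \nabla_\mu] \Box_r^{-1} \nabla_\nu A^\nu - R_{\mu\nu} A^\nu + R_\mu{}^\nu \nabla_\nu \Box_r^{-1} \nabla_\rho A^\rho$
  $= \Box A_\mu - \nabla_\mu \nabla_\nu A^\nu - R_\mu{}^\nu \nabla_\nu \Box_r^{-1} \nabla_\rho A^\rho - R_{\mu\nu} A^\nu + R_\mu{}^\nu \nabla_\nu \Box_r^{-1} \nabla_\rho A^\rho$
  $= \Box A_\mu - \nabla_\nu \nabla_\mu A^\nu - [\nabla_\mu, \nabla_\nu] A^\nu - R_{\mu\nu} A^\nu$
  $= \Box A_\mu - \nabla_\nu \nabla_\mu A^\nu = \nabla^\nu F_{\nu\mu}$ , (4.1.23)

so the equation of motion can be written 3

$\left[ \delta_\mu^\nu \Box - R_\mu{}^\nu \right] A^T_\nu = -j_\mu$ . (4.1.24)

The Ricci term makes the square bracket commute with the divergence operation, which then
gives zero when acting on $A^T$. Thus, the existence of a gauge-invariant action is related to the
existence of a closed form for the transverse projector that can be read out of the equations of
motion. In the case of higher-spin fields, if there existed such a closed form for $\mathcal{P}$ on arbitrary
backgrounds, then one could construct gauge-invariant equations of motion, in closed form,
and thus deduce a gauge-invariant action. This is why there exists no closed form for ${}_a\mathcal{P}_{\mu\nu}{}^{\rho\sigma}$
on arbitrary space-times.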


4.1.2 Action-based models 


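The other possibility for constructing transverse equations of motion, considered for instance
in [65,75,96,104], is to start with a generally-covariant (formal) action. Indeed, say we have
such an action for pure gravity

$S = \int d^D x \sqrt{-g}\, L[g]$ , (4.1.25)

where the Lagrangian $L$ is a scalar. Then, performing an infinitesimal (active) diffeomorphism

$\delta g^{\mu\nu} = -\mathcal{L}_\xi g^{\mu\nu} = -\xi^\rho \partial_\rho g^{\mu\nu} + g^{\rho\nu} \partial_\rho \xi^\mu + g^{\mu\rho} \partial_\rho \xi^\nu = 2 \nabla^{(\mu} \xi^{\nu)}$ , (4.1.26)

we get

$\delta S = \int d^D x\, \delta g^{\mu\nu} \frac{\delta (\sqrt{-g} L)}{\delta g^{\mu\nu}} = 2 \int d^D x\, \nabla^\mu \xi^\nu \frac{\delta (\sqrt{-g} L)}{\delta g^{\mu\nu}} = -2 \int d^D x \sqrt{-g}\, \xi^\nu \nabla^\mu \left[ \frac{1}{\sqrt{-g}} \frac{\delta (\sqrt{-g} L)}{\delta g^{\mu\nu}} \right]$ . (4.1.27)

Since diffeomorphisms are a symmetry of the action we have that $\delta S = 0$, for any $g_{\mu\nu}$ and $\xi^\mu$,
so that

$\nabla^\mu \left[ \frac{1}{\sqrt{-g}} \frac{\delta (\sqrt{-g} L)}{\delta g^{\mu\nu}} \right] = 0$ , (4.1.28)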

is an identity, independently of whether S is local or not. We thus see that the utility of the 
formal non-local actions, defined in section 3.1.2, is not only ornamental anymore, it has become 


3 In the Proca case the equation of motion in this form is simply modified by $\Box \to \Box - m^2$.

a valuable tool in deriving transverse equations of motion. Note that the ad hoc prescription of
turning all the $\Box^{-1}$ into retarded ones at the end of the variation does not spoil transversality.
Indeed, the latter being a local property, it cannot depend on the choice of $\Box^{-1}$, since what
distinguishes all these operators is non-local information, i.e. the boundary/initial data of the
Green’s function. All that matters is that $\Box_r^{-1}$ is a right-inverse of $\Box$. We can therefore safely
apply our variational principle on the formal action.

We stress one more time that formal actions should not be given any physical meaning.
Their variation gives rise to non-causal equations of motion, which we make causal by hand
afterwards. Moreover, remember that non-local theories are classical theories 4 , so all the
information lies in the final, causal, equations of motion.

Finally, now that $\Box$ depends on the metric, we need a formula for the variation of $\Box^{-1}$ with
respect to $g_{\mu\nu}$ at the level of the formal action. To compute this, we use the same logic as in
appendix A.3.3. We apply the variation on $\Box \Box^{-1} = {\rm id}$ to get

$(\delta \Box) \Box^{-1} + \Box (\delta \Box^{-1}) = 0$ , (4.1.29)

and then apply $\Box^{-1}$ from the left to isolate the quantity of interest

$\delta \Box^{-1} = -\Box^{-1} (\delta \Box) \Box^{-1}$ . (4.1.30)

The above equation holds modulo homogeneous solutions, which is indeed the level at which
the variation is performed for formal actions.
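
The rule for varying an inverse operator can be illustrated in a finite-dimensional setting, where the operator is an invertible matrix and there are no homogeneous solutions to worry about. The sketch below is only an analogy: the matrices `BOX` and `DBOX` are hypothetical stand-ins for the operator and its variation, and the check is that the finite-difference variation of the inverse approaches minus inverse times variation times inverse linearly in the step `t`.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Exact inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


# Hypothetical stand-ins: BOX plays the role of the operator, DBOX of its variation.
BOX = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
DBOX = [[Fraction(1), Fraction(-1)], [Fraction(0), Fraction(2)]]


def linear_variation_error(t):
    """Largest entry of [inv(BOX + t*DBOX) - inv(BOX)]/t + inv(BOX) DBOX inv(BOX)."""
    shifted = [[BOX[i][j] + t * DBOX[i][j] for j in range(2)] for i in range(2)]
    inv0, inv1 = inverse(BOX), inverse(shifted)
    # This product is minus the predicted first-order variation of the inverse.
    predicted = matmul(matmul(inv0, DBOX), inv0)
    return max(abs((inv1[i][j] - inv0[i][j]) / t + predicted[i][j])
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    for t in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
        print(float(t), float(linear_variation_error(t)))
```

The residual shrinks in proportion to `t`, as expected for a first-order formula. In the field-theory case the same manipulation is valid only up to homogeneous solutions, which the matrix analogy cannot capture.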

Example 

Now that we have all the necessary tools let us work out the simplest example

$S = \frac{1}{2} \int d^D x \sqrt{-g}\, R\, \Box^{-1} R$ . (4.1.31)

Using $\delta R = (R_{\mu\nu} + g_{\mu\nu} \Box - \nabla_\mu \nabla_\nu)\, \delta g^{\mu\nu}$, integrating by parts at will and sending $\Box^{-1} \to \Box_r^{-1}$
at the end we get

$g_{\mu\nu} R - \nabla_\mu \nabla_\nu \Box_r^{-1} R + G_{\mu\nu} \Box_r^{-1} R + \frac{1}{2} (\nabla_\mu \Box_r^{-1} R) \nabla_\nu \Box_r^{-1} R - \frac{1}{4} g_{\mu\nu} (\nabla_\rho \Box_r^{-1} R) \nabla^\rho \Box_r^{-1} R$ . (4.1.32)

Let us now check the transversality of this expression. Taking the divergence and using
$[\Box, \nabla_\mu] \phi = R_{\mu\nu} \nabla^\nu \phi$ to simplify the second term

$\Box \nabla_\nu \Box_r^{-1} R = R_{\nu\mu} \nabla^\mu \Box_r^{-1} R + \nabla_\nu R$ , (4.1.33)

we get zero indeed. One thing that is obvious from this example is that the equations of
motion of a simple non-local action will usually be rather complicated. There is thus also a
practical advantage in describing the model through a formal non-local action, that is, being
able to display its information in a compact way. Remember however that, since we have
replaced by hand $\Box^{-1} \to \Box_r^{-1}$, these are not the equations of motion of this action, i.e. $\delta S \neq 0$
around these solutions.
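
The transversality check can be spelled out term by term. The sketch below assumes that (4.1.32) reads $g_{\mu\nu} R - \nabla_\mu \nabla_\nu U + G_{\mu\nu} U + \frac{1}{2} \nabla_\mu U \nabla_\nu U - \frac{1}{4} g_{\mu\nu} (\nabla U)^2$ with the shorthand $U \equiv \Box_r^{-1} R$ (the symbol $U$ is introduced here only for compactness), and uses $\Box U = R$, $\nabla^\mu G_{\mu\nu} = 0$ and $[\Box, \nabla_\nu] U = R_{\nu\mu} \nabla^\mu U$.

```latex
\begin{align}
\nabla^\mu \left( g_{\mu\nu} R \right) &= \nabla_\nu R , \\
-\nabla^\mu \nabla_\mu \nabla_\nu U &= -\nabla_\nu R - R_{\nu\mu} \nabla^\mu U , \\
\nabla^\mu \left( G_{\mu\nu} U \right) &= R_{\nu\mu} \nabla^\mu U - \tfrac{1}{2} R\, \nabla_\nu U , \\
\tfrac{1}{2} \nabla^\mu \left( \nabla_\mu U\, \nabla_\nu U \right)
  &= \tfrac{1}{2} R\, \nabla_\nu U + \tfrac{1}{2} \nabla^\mu U\, \nabla_\mu \nabla_\nu U , \\
-\tfrac{1}{4} \nabla_\nu \left( \nabla_\rho U\, \nabla^\rho U \right)
  &= -\tfrac{1}{2} \nabla^\rho U\, \nabla_\nu \nabla_\rho U .
\end{align}
```

Adding the five right-hand sides, the $\nabla_\nu R$, $R_{\nu\mu} \nabla^\mu U$ and $R\, \nabla_\nu U$ terms cancel pairwise, and the last two terms cancel because $\nabla_\mu \nabla_\nu U$ is symmetric for a scalar $U$. Only the right-inverse property $\Box U = R$ is used, so the result does not depend on which Green's function defines $U$.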

4 Indeed, as discussed in section 3.2.1, one cannot quantize a non-local theory without either enlarging the 
set of solutions in the classical limit, or losing unitarity. 


4.1.3 The necessity of considering the scalar mode 

Before we proceed to the construction of the generally-covariant transverse equations, we can
already note one limitation of our procedure. Indeed, it appears that the non-local formulation
of Fierz-Pauli theory will not be generalizable as wished. Remember that FP theory can be
expressed as (2.7.55) which corresponds to (2.7.62) with $z = 0$, $m_s = 0$ and with the source
$T_{\mu\nu}$ replaced by its transverse-traceless part $T^{TT}_{\mu\nu}$. This theory has an extra gauge symmetry
(2.7.35) which is responsible for neutralizing the trace mode. In the non-linear context, the
natural generalization of this symmetry is the conformal transformation

$g_{\mu\nu} \to e^{2\theta} g_{\mu\nu}$ . (4.1.34)

Thus, in order to keep this field non-dynamical in the non-linear theory we need the latter to
be conformally invariant as well. This is however impossible to implement for the following
reasons.

Although we do have building blocks that are generally covariant, the curvature tensors,
the only one which is also covariant under (4.1.34) is the Weyl tensor. A first disadvantage
is then that an action made exclusively out of the Weyl tensor could hardly be considered as
a deformation of GR. One should then use the $\Box^{-1}$ which transforms homogeneously under
conformal transformations. For instance, in the case where $\Box$ acts on a scalar, i.e. $\Box_\xi \equiv \Box - \xi R$,
the covariant choice is $\xi = (d - 1)/4d$ and the transformation is

$\Box_\xi \to e^{-\frac{d+3}{2} \theta}\, \Box_\xi\, e^{\frac{d-1}{2} \theta}$ . (4.1.35)

Using the conformally-covariant $\Box^{-1}$ for tensors of rank 4, the only action made of the Weyl
tensor and $\Box^{-1}$ which gives the non-local FP theory in the linearized limit is

/Lf 2 r 1 ( TTl^ 1 \ 

S = — j d D x^~g IWg I 1 - -J- j W + 0(W 3 ). (4.1.36) 

However, because of the fixed masses, here $M$ and $m_g$, conformal invariance is still not achieved.
Indeed, even if the transformation of each term is homogeneous, in the presence of a fixed mass
there remains an overall exponential factor $e^{c\theta}$ 5 . The same happens in a projector-based
equation, i.e. the terms that come with different powers of mass do not transform with the
same powers of $e^{\theta}$ 6 . On top of this problem, note that the coupling to matter should also
be made non-local in order to be conformally-invariant, so that the source is yet another
challenge. This is why we have also considered the non-local theories that include the trace
scalar (2.7.62), but in a healthy way, so that we do not need to implement conformal invariance.
From now on we will only consider these models.

5 These overall factors are not seen in the linearized limit because they multiply second-order terms in the
action, or first-order terms in the equations of motion, and thus reduce to $e^{c\theta} \to 1$.

6 The above problems could be resolved if we replace the fixed masses by a scalar field $\varphi$ sitting in a non-trivial
minimum of its potential and transforming homogeneously under (4.1.34)

$\varphi \to e^{-\theta} \varphi$ . (4.1.37)

This allows us to use all the curvature invariants, since we can compensate their inhomogeneous transformation
with the one of the kinetic term of $\varphi$, while at the same time there are no fixed masses and thus no leftover
exponential factors under (4.1.34). The problem now however is that we have one more dynamical field $\varphi$ and
the gauge symmetry either neutralizes the latter or the scalar mode in $g_{\mu\nu}$, not both. Thus, we still have one
more dynamical field than what we started with.
4.2 Action-based models


4.2.1 Constructing the action 

We now wish to construct an action-based generally-covariant extension of the model (2.7.62)
introduced in section 2.7.3. The formal action corresponding to (2.7.62) is

$S = \frac{1}{2} \int d^D x\, h_{\mu\nu} \left[ (\Box - m_g^2)\, {}_0\mathcal{P}^{\mu\nu\rho\sigma} + (z \Box - m_s^2)\, {}_s\mathcal{P}^{\mu\nu\rho\sigma} \right] h_{\rho\sigma}$ , (4.2.1)

and now $h_{\mu\nu}$ is interpreted as the perturbation of some metric around the Minkowski one

M 

h[M/ = — ~ Vfiv) i (4.2.2) 

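where $M = (8 \pi G)^{-1/2}$ is the reduced Planck mass in $D = 4$. The only terms that contribute
to the linearized action are those linear and quadratic in curvature. A general enough action
to match (4.2.1) at that order is

$S_2 = \frac{M^2}{2} \int d^D x \sqrt{-g} \left[ R + \frac{1}{2} R\, O_1 R - 2 R_{\mu\nu} O_2 R^{\mu\nu} + \frac{1}{2} R_{\mu\nu\rho\sigma} O_3 R^{\mu\nu\rho\sigma} \right]$ , (4.2.3)

where the $O_i$ are operators of the form

$O_i = A_i \Box^{-1} + B_i \Box^{-2}$ , (4.2.4)

and $A_i$, $B_i$ are constants. An alternative parametrization that will be useful later is

$S_2 = \frac{M^2}{2} \int d^D x \sqrt{-g} \left[ R + \frac{1}{2} R\, \tilde{O}_1 R - 2 R_{\mu\nu} \tilde{O}_2 R^{\mu\nu} + \frac{1}{2} W_{\mu\nu\rho\sigma} O_3 W^{\mu\nu\rho\sigma} \right]$ , (4.2.5)

where

$W_{\mu\nu\rho\sigma} \equiv R_{\mu\nu\rho\sigma} - \frac{2}{d-1} \left( g_{\mu[\rho} R_{\sigma]\nu} - g_{\nu[\rho} R_{\sigma]\mu} \right) + \frac{2}{d(d-1)}\, g_{\mu[\rho} g_{\sigma]\nu} R$ (4.2.6)

is the Weyl tensor and

$\tilde{O}_2 = O_2 - \frac{1}{d-1} O_3$ . (4.2.7)

We can then write the linearization of (4.2.3) as

$S = \frac{1}{2} \int d^D x\, h_{\mu\nu} \mathcal{K}^{\mu\nu\rho\sigma} h_{\rho\sigma}$ , (4.2.8)

to find

$\mathcal{K}^{\mu\nu\rho\sigma} = \left( 2(O_3 - O_2) + \Box^{-1} \right) \eta^{\mu(\rho} \eta^{\sigma)\nu} \Box^2 - \left( 2(O_3 - O_2) + \Box^{-1} \right) \left( \eta^{\mu(\rho} \partial^{\sigma)} \partial^\nu + \eta^{\nu(\rho} \partial^{\sigma)} \partial^\mu \right) \Box$
  $+ \left( 2(O_2 - O_1) + \Box^{-1} \right) \left( \eta^{\mu\nu} \partial^\rho \partial^\sigma + \eta^{\rho\sigma} \partial^\mu \partial^\nu \right) \Box - \left( 2(O_2 - O_1) + \Box^{-1} \right) \eta^{\mu\nu} \eta^{\rho\sigma} \Box^2$
  $+ 2 \left( O_1 - 2 O_2 + O_3 \right) \partial^\mu \partial^\nu \partial^\rho \partial^\sigma$ . (4.2.9)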
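By diffeomorphism invariance, $\mathcal{K}$ is transverse so it must be a combination of the ${}_a\mathcal{P}$ operators.
Equating this to (4.2.1) we get

$\left( 2(O_3 - O_2) + \Box^{-1} \right) \Box^2 = \Box - m_g^2$ , (4.2.10)

$\left( 2(O_2 - O_1) + \Box^{-1} \right) \Box^2 = \frac{1}{d} (\Box - m_g^2) - \frac{1}{d} (z \Box - m_s^2)$ , (4.2.11)

and the solutions are

$O_1 = \left[ A_3 + 1 - \frac{D - z}{2d} \right] \Box^{-1} + \left[ B_3 + \frac{D m_g^2 - m_s^2}{2d} \right] \Box^{-2}$ ,
$O_2 = A_3 \Box^{-1} + \left[ B_3 + \frac{m_g^2}{2} \right] \Box^{-2}$ , (4.2.12)

or alternatively

$\tilde{O}_1 = \left[ \frac{d^2 - d - 1}{d(d-1)} A_3 + 1 - \frac{D - z}{2d} \right] \Box^{-1} + \left[ \frac{d^2 - d - 1}{d(d-1)} B_3 + \frac{D m_g^2 - m_s^2}{2d} \right] \Box^{-2}$ ,
$\tilde{O}_2 = \frac{d-2}{d-1} A_3 \Box^{-1} + \left[ \frac{d-2}{d-1} B_3 + \frac{m_g^2}{2} \right] \Box^{-2}$ . (4.2.13)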

We have two equations for three operators, which is due to the fact that one can add an
arbitrary operator $O$ to all the $O_i$ simultaneously without changing the linearized $S$. This is a
consequence of the fact that the Gauss-Bonnet-like combination

$\int d^D x \left[ R\, O R - 4 R_{\mu\nu} O R^{\mu\nu} + R_{\mu\nu\rho\sigma} O R^{\mu\nu\rho\sigma} \right]$ , (4.2.14)

is a total derivative at the linearized level for all $O$ if $[\partial, O] = 0$. It becomes however non-trivial
at the non-linear level when $O$ is an inverse differential operator, even for $D = 4$, because then
$[\nabla, \Box^{-1}] \neq 0$. Now that we have expressed the linear action in terms of curvature invariants
we can easily generalize it to a fully non-linear theory.
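
The matching conditions can be checked mechanically by working in "momentum space", where $\Box$ is replaced by a non-zero number `x` and each operator becomes a rational function of `x`. The sketch below is an illustration under stated assumptions: the coefficients are my reading of (4.2.10) to (4.2.12), with $O_3 = A_3 \Box^{-1} + B_3 \Box^{-2}$ and $D = d + 1$, and the function names and sample values are hypothetical.

```python
from fractions import Fraction as F


def operators(x, d, z, mg2, ms2, A3, B3):
    """Values of O_1, O_2, O_3 when Box is replaced by the number x."""
    D = d + 1  # assumption: D = d + 1 space-time dimensions
    O3 = A3 / x + B3 / x**2
    O2 = A3 / x + (B3 + mg2 / 2) / x**2
    O1 = ((A3 + 1 - F(D - z) / (2 * d)) / x
          + (B3 + (D * mg2 - ms2) / (2 * d)) / x**2)
    return O1, O2, O3


def residuals(x, d, z, mg2, ms2, A3, B3):
    """Left-hand side minus right-hand side of the two matching conditions."""
    O1, O2, O3 = operators(x, d, z, mg2, ms2, A3, B3)
    r_first = (2 * (O3 - O2) + 1 / x) * x**2 - (x - mg2)
    r_second = (2 * (O2 - O1) + 1 / x) * x**2 - ((x - mg2) - (z * x - ms2)) / d
    return r_first, r_second


if __name__ == "__main__":
    # Arbitrary rational sample values; exact arithmetic avoids rounding issues.
    print(residuals(F(-7, 3), 3, F(2, 5), F(1, 4), F(5, 9), F(3, 7), F(-2, 11)))
```

Both residuals vanish for any choice of `A3` and `B3`, which reflects the remark that a common operator can be added to all three $O_i$ without changing the linearized action.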


4.2.2 Curvature expansion 

There are two types of modifications that can occur in generalizing the above theory. The first
one is the same as in the local case, i.e. one can add arbitrary local terms that are higher order
in curvature. Since $R_{\mu\nu\rho\sigma}$ is dimensionful, these terms come with associated mass scales which
control the scale at which they influence the physics. Thus, as long as we work at length scales
larger than the inverse of the smallest of these masses, the lowest order terms are more than
enough. This is the principle of effective field theory, which allows one to consider the most
general possible action, compatible with the symmetries of the system, at the energies of interest.

The second kind of modification is that one can add arbitrary non-local terms that are higher
order in curvature. Unlike their local counterparts, these need not have higher mass dimension.
They can actually be dimensionless, such as $\Box^{-1} R$ for instance, or even have negative mass
dimensions, such as $\Box^{-2} R$. This means that their coefficients can have zero dimension, in
which case they cannot be neglected for “natural” $\mathcal{O}(1)$ values, whatever the scale, or positive
mass dimension, in which case they dominate the low-energy physics. As a matter of fact,
in non-local field theory such power-counting arguments are more limited, because a $\sim \Box^{-1}$
term can dominate at large space-time scales, because of the cumulative effect of the integral,
without necessarily having an overall negative mass dimension.

From the point of view of effective field theory, this is a drawback of non-local field theories,
i.e. symmetry alone does not reduce the terms that are relevant for low-energy physics to a
finite set. From the point of view of the phenomenologist however, this can be seen as an
advantage, since one has many different possibilities for modifying the infrared physics.

We thus see that by abandoning locality we gain access to far too many non-linear theories
and thus need some more input in order to select a given subset. For simplicity we will only
consider theories that are second-order in curvature such that there are no terms which do not
contribute to the linearized theory. Moreover, we will not consider terms involving derivatives
of curvature tensors, such as

$(\nabla^\mu \Box^{-1} R_{\mu\rho})\, \nabla_\nu \Box^{-1} R^{\nu\rho}$ , (4.2.15)

for instance. Their inclusion could be very interesting, but as we will argue later, they will not
influence our results qualitatively. With these simplifications, we are then left with $A_3$ and $B_3$
as unknown parameters, as well as the operators $\Box^{-1}$.


4.2.3 Choosing $A_3$ and $B_3$

The Ricci model

From the purely theoretical point of view, the most elegant and simple model is the one with
no Riemann tensor terms in the action, i.e. $A_3 = B_3 = 0$. The action can then be conveniently
written

$S_R = \frac{M^2}{2} \int d^D x \sqrt{-g} \left[ R + \frac{1}{2} R \left( Z \Box^{-1} + m_R^2 \Box^{-2} \right) R - m_g^2 R_{\mu\nu} \Box^{-2} R^{\mu\nu} \right]$ , (4.2.16)

where

$Z \equiv \frac{z + d - 1}{2d}$ , $m_R^2 \equiv \frac{D m_g^2 - m_s^2}{2d}$ , (4.2.17)

and we will refer to it as the “Ricci” model. A nice feature of this model is that it shares all
the empty space solutions of GR, such as the Schwarzschild and Kerr solutions, whatever the
value of the masses. Indeed, since the departure from GR is made of terms quadratic in the
Ricci scalar and tensor, we have that every term in the equations of motion will have at least
one Ricci tensor or scalar, so that all of them vanish when $R_{\mu\nu} = 0$. This should be contrasted
with local massive gravity, where the stationary black hole solution is modified in a non-trivial
way, as mentioned in the introduction when we discussed the Vainshtein mechanism.


The Weyl model 

Remembering that our aim for constructing such theories is to account for dark energy, we
should now see what background cosmology has to say about the $A_3$ and $B_3$ parameters. In
this context, since the Weyl tensor vanishes for the FLRW metric, only the terms involving the
Ricci scalar and Ricci tensor matter. Moreover, for the energy scales of late-time cosmology,
the “past infinity” of the period of interest is the radiation-dominated era in which case $R = 0$.
Thus, in that case $R$ has finite past, while $R_{\mu\nu}$ does not, so it is a natural condition
to impose that all $\Box^{-1}$ act exclusively on $R$ and $W_{\mu\nu\rho\sigma}$. If this were not the case, as in
the Ricci model, one would have to choose an initial time $t_i$ to begin the convolution with
the retarded Green’s function. One could then adopt an effective theory point of view and
say that at earlier times the energy is above the region of validity of the theory, so that
the latter makes sense only for $t > t_i$. Nevertheless, one would still remain with a non-trivial
dependence of the history of the universe on that time $t_i$, and with no particular way to privilege
a given choice. Most importantly however, in practice the terms in which $\Box^{-1}$ acts on $R_{\mu\nu}$ do
not offer a viable cosmological background evolution because they generically give rise to diverging modes
[72,73,106,107]. Therefore, although the Ricci model may have its theoretical advantages, it
is not phenomenologically viable. With the $\Box^{-1}$’s acting only on $R$ and the Weyl tensor we
avoid these conceptual and practical worries and have a well-defined convolution.

Another advantage of this prescription is that the beginning of the matter-dominated era
marks the beginning of the non-local memory effect since this is when $\Box^{-1} R$ starts recording
the past. This is a cumulative effect and can become non-negligible at considerably later times.
Therefore, in this scenario one obtains an elegant alleviation of the coincidence problem, since
dark energy appears as a delayed effect of the matter-radiation transition. This was actually
the original motivation for the Deser-Woodard model (3.3.1) [65], to relate the dark energy
scale and timing to an earlier event in the history without having to introduce a new fixed
scale. Here we also consider such fixed mass scales but the spirit is the same.

Given the above considerations, we fix $A_3$ and $B_3$ so that the terms quadratic in $R_{\mu\nu}$ drop in the Weyl
representation of the action, i.e. so that $\tilde{O}_2 = 0$. Given (4.2.13), we get

$A_3 = 0$ , $B_3 = -\frac{d-1}{d-2} \frac{m_g^2}{2}$ , (4.2.18)

and thus

$S_W = \frac{M^2}{2} \int d^D x \sqrt{-g} \left[ R + \frac{1}{2} R \left( Z \Box^{-1} - m_R^2 \Box^{-2} \right) R - \frac{1}{2} m_W^2 W_{\mu\nu\rho\sigma} \Box^{-2} W^{\mu\nu\rho\sigma} \right]$ , (4.2.19)

where now

$m_R^2 \equiv \frac{m_g^2 + (d-2) m_s^2}{2d(d-2)}$ , $m_W^2 \equiv \frac{d-1}{d-2} \frac{m_g^2}{2}$ . (4.2.20)


We will refer to this as the “Weyl” model. In contrast with the Ricci model, this model does 
not have the vacuum solutions of GR since the Weyl tensor is precisely the part of the curvature 
which is non-trivial in this case. Finally, note that both the Ricci (4.2.16) and the Weyl (4.2.19) 
models reduce to GR in the massless limit only if $Z = 0$, which translates into $z = 1 - d$ and
thus implies that the trace scalar is a ghost, as already noted in section 2.7.3.


4.2.4 Localization 


Here our expressions will be simpler if we redefine the reduced Planck mass as
$M = (16 \pi G)^{-1/2}$, instead of $M = (8 \pi G)^{-1/2}$.


Weyl model 

Let us first consider the Weyl model (4.2.19). Since $R$ and $W_{\mu\nu\rho\sigma}$ are independent components
of the Riemann tensor, we have to consider a localizing field for each one of them. One
possibility is

$S_W = \int d^D x \sqrt{-g} \left[ M^2 R + M \phi R + \frac{1}{2 m_R^2} \left( \Box \phi - \frac{Z}{2} M R \right)^2 + M W_{\mu\nu\rho\sigma} \phi^{\mu\nu\rho\sigma} + \frac{1}{2 m_W^2} \left( \Box \phi_{\mu\nu\rho\sigma} \right)^2 \right]$ . (4.2.21)

Indeed, integrating them out using the following solutions

$\phi = M \left( \frac{Z}{2} \Box^{-1} R - m_R^2 \Box^{-2} R \right)$ , (4.2.22)

$\phi_{\mu\nu\rho\sigma} = -M m_W^2 \Box^{-2} W_{\mu\nu\rho\sigma}$ , (4.2.23)

we retrieve (4.2.19). It is obvious that $\phi_{\mu\nu\rho\sigma}$ has the same symmetries as the Weyl tensor

$\phi_{\mu\nu\rho\sigma} = -\phi_{\mu\nu\sigma\rho} = -\phi_{\nu\mu\rho\sigma}$ , $\phi_{\mu\nu\rho\sigma} + \phi_{\mu\rho\sigma\nu} + \phi_{\mu\sigma\nu\rho} = 0$ , $\phi^\rho{}_{\mu\rho\nu} = 0$ , (4.2.24)

corresponding to the Young tableau

(4.2.25)

Note that (4.2.21) is a higher derivative theory both for the auxiliary fields and for gravity. To
gain more insight, let us integrate in two more auxiliary fields in order to lower the derivative
order of the $\phi$’s

$S_W = \int d^D x \sqrt{-g} \left[ M^2 R + M \left( \phi + \frac{Z}{2} \psi \right) R - \psi \Box \phi - \frac{m_R^2}{2} \psi^2
 + M W_{\mu\nu\rho\sigma} \phi^{\mu\nu\rho\sigma} - \psi_{\mu\nu\rho\sigma} \Box \phi^{\mu\nu\rho\sigma} - \frac{m_W^2}{2} \psi_{\mu\nu\rho\sigma} \psi^{\mu\nu\rho\sigma} \right]$ . (4.2.26)

The $\psi$’s carry the information of the initial conditions of the second and third time derivatives
of the $\phi$’s, so they are also constrained, even though integrating them out does not require
inverting $\Box$. We see that the action has become linear in the $\phi$’s. Integrating the latter out
and choosing the solutions

$\psi = M \Box^{-1} R$ , $\psi_{\mu\nu\rho\sigma} = M \Box^{-1} W_{\mu\nu\rho\sigma}$ , (4.2.27)

gives back (4.2.19). We could have started with this simpler localization 7 , but this might have
misled us to think that the initial conditions of the $\phi$’s are arbitrary, since they are not a priori
determined by the equations. With this procedure, we see explicitly that actually both the $\phi$’s
and the $\psi$’s are constrained. As a check, note that for $m_R = m_W = 0$ and $Z = 0$ we recover
GR. The action becomes linear in the $\psi$’s and thus their equations of motion

$\Box \phi = 0$ , $\Box \phi_{\mu\nu\rho\sigma} = 0$ , (4.2.28)

imply that the $\phi$’s vanish since they have no homogeneous solution. The action then turns into
the Einstein-Hilbert one.

7 Indeed, the direct use of Lagrange multipliers to enforce relations among fields is rather the usual procedure
[75,76,98,109-116].
Ricci model

Let us now localize (4.2.16). Although this model is not phenomenologically viable as far as
cosmology is concerned, because of the presence of the Ricci tensor, it is interesting to consider
it as well for its theoretical properties. Here we can consider a single localizing field $\phi_{\mu\nu}$, since
$R$ is the trace of $R_{\mu\nu}$. Going directly to the second-order formulation, we get


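$S_R = \int d^D x \sqrt{-g} \left[ M^2 R + M R_{\mu\nu} \left( \phi^{\mu\nu} + \frac{Z}{2} g^{\mu\nu} \psi \right) - \psi_{\mu\nu} \Box \phi^{\mu\nu} - m_g^2 \psi_{\mu\nu} \psi^{\mu\nu} + \frac{m_R^2}{2} \psi^2 \right]$ , $\psi \equiv g^{\mu\nu} \psi_{\mu\nu}$ . (4.2.29)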

4.2.5 Ghosts 

At the linearized level integrating in a vector and a scalar would have been sufficient in making
the action local. This is because the non-local operators acted on lower-rank tensors such
as $\partial^\mu h_{\mu\nu}$. Here we see that, since the non-local operators act on curvature invariants, the
localization necessarily involves tensors of rank two or more. Thus, the dynamical content of
these theories is considerably larger. Most importantly however, some of these fields are ghost-like.
Indeed, the first hint lies in the fact that, if we diagonalize a term of the form $\phi \Box \psi$, we get

$\phi \Box \psi = (\Phi + \Psi) \Box (\Phi - \Psi) = \Phi \Box \Phi - \Psi \Box \Psi$ . (4.2.30)

This is of course not a rigorous proof because one should first linearize/diagonalize the full
action and only then compare the signs of the kinetic terms. However, this procedure is not
possible in general without reintroducing non-localities. So let us try in the simplest case.
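
The diagonalization step hides one integration by parts, which can be made explicit. The sketch below assumes the field redefinition $\phi = \Phi + \Psi$, $\psi = \Phi - \Psi$ (my reading of (4.2.30)) and works under the integral sign, so that total derivatives are dropped.

```latex
\begin{align}
\phi\, \Box \psi
  &= (\Phi + \Psi)\, \Box\, (\Phi - \Psi)
   = \Phi \Box \Phi - \Psi \Box \Psi + \left( \Psi \Box \Phi - \Phi \Box \Psi \right) , \\
\Psi \Box \Phi - \Phi \Box \Psi
  &= \nabla_\mu \left( \Psi \nabla^\mu \Phi - \Phi \nabla^\mu \Psi \right) .
\end{align}
```

The cross terms therefore form a total derivative and the two kinetic terms come with opposite signs, whatever the overall sign of the original mixing term.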


The m g = 0 case 


Let us consider the case $m_g = 0$, in which case we only have the auxiliary scalar sector and
the Ricci and Weyl models become the same. Then, linearizing over Minkowski space-time
which, given the constraints on the scalars, is the solution

$g_{\mu\nu} = \eta_{\mu\nu}$ , $\phi = \psi = 0$ ,

using (4.2.2), (4.2.17), (4.2.20) and the following redefinitions


hi/v — h,.,, 


V2 


— '"[tv d _ I ^ 

one gets the diagonal action 


Z . 


f' 


d- 1 




(4.2.31) 

(4.2.32) 


5 = 


d D x \f-~g 


\ dptpd^ip - (zd^d^il) + m^ 2 ) 


+h' gu T^ - V2 


dh* + - d ^ T 


(4.2.33) 


Now $\phi$ and $\psi$ also couple to the source $T_{\mu\nu}$, and thus contribute to the saturated propagator. We thus
retrieve the structure of (2.7.68) in the $m_g \to 0$ limit, up to a field normalization, i.e. the
auxiliary scalars $\phi$ and $\psi$ correspond to the scalar poles. More precisely, $\phi$ corresponds to the
healthy scalar pole which is responsible for the vDVZ discontinuity between the FP propagator
and the one of GR when $m_g \to 0$, i.e. it is the longitudinal mode of the massive graviton which
does not decouple. On the other hand, $\psi$ corresponds to the trace scalar with mass $m_s$ and is
healthy when $z > 0$.
The $m_g \neq 0$ case

So what about the auxiliary tensor modes in the $m_g \neq 0$ models? A first argument supporting
the presence of ghosts is that there is no particular kinetic structure that would neutralize the
time-components which come with the wrong signs. Indeed, the actions of linear tensor theories
are ghost-free only in the presence of quadratic combinations that provide a gauge symmetry
which kills the ghost modes [99]. Here it seems that non-local terms which mix the tensor indices
non-trivially, such as the example given in (4.2.15), could arrange this situation by providing
the necessary structure. As already noted earlier however, it is a notorious problem that higher-spin
actions cannot maintain their gauge symmetries on arbitrary backgrounds [100-103]. This
is why we did not consider terms such as (4.2.15) in our action, because they cannot resolve
this ghost problem anyway.

On top of this issue, which concerns each diagonalized tensor field separately, we also note
that in the scalar case the $\sim Z$ term is crucial in making the action ghost-free. Since there is
no analogous term in the tensor sector, we expect that the diagonalized fields will exhibit a
ghost/non-ghost structure like (4.2.30). The corresponding new poles are indeed also present
in the saturated propagator, since the diagonalization will inevitably make the auxiliary fields
couple to $T_{\mu\nu}$; it is just that they will add up with the tensor structure of $h_{\mu\nu}$ 8 . The tensor
part of the propagator (2.7.67) is thus the sum of these three contributions $h$, $\phi$, $\psi$, and only the
result has the correct sign. We will therefore have poles with the wrong residue signs for the
ghost modes, but since the sum must be healthy, these will necessarily be canceled by healthy
poles

$\frac{1}{k^2 + m^2} - \frac{1}{k^2 + m^2}$ . (4.2.34)

This is why these modes can be missed when working directly at the level of the non-local
theory: they simply cancel out in the propagator 9 . This is also why the propagator only provides
a lower bound on the number of dynamical fields, because there might be cancellations among
the corresponding propagators in the presence of ghosts.

Now note that this cancellation occurs only classically and only at the linearized level in
the propagator. More rigorously, classically the retarded $\epsilon$ prescription for (4.2.34) gives

$\lim_{\epsilon \to 0^+} \left[ \frac{1}{-(k^0 + i\epsilon)^2 + \vec{k}^2 + m^2} - \frac{1}{-(k^0 + i\epsilon)^2 + \vec{k}^2 + m^2} \right] = 0$ , (4.2.35)

so we do have a cancellation. Indeed, since $\epsilon$ displaces the poles in the integral over $k^0$ it is their
relative sign that matters for having a retarded response, not the overall sign of the propagator.
The prescription is therefore the same for both healthy and ghost fields.

8 In the Weyl model the tensor structure will correspond to the one of a 4-tensor, but since the source is a 
2-tensor, the saturated propagator will reveal the same type of structure as the one of $h_{\mu\nu}$. 

9 This is similar to what happens in Barvinsky’s non-local theory (3.3.2) [96]. Indeed, the linearized action 
is the one of GR, and thus has a healthy propagator, but the non-linear localized action contains an auxiliary 
tensor on top of the metric, and the latter obviously has ghost modes. We thus have that the propagator 
of the diagonalized/localized theory has ghost poles that are compensated by healthy ones, as is clear from eq. 
(28) of [96]. Thus, the ghost propagator simply appears to shift the graviton propagator, canceling it exactly 
for a = 1. 

Quite interestingly, such a cancellation would not occur in a local QFT with a ghost/healthy 
pair (4.2.34). Indeed, for scattering amplitudes, where it is the Feynman propagator that arises, 
the $\epsilon$ prescription comes from the modification of the path integral which makes it converge. 
This means that unitarity forces the choice 


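\[
  S_{\rm aux.\,scal.} = \int d^D x \left[ \frac{1}{2}\,\phi_1 \left( \Box - m^2 + i\epsilon \right) \phi_1 - \frac{1}{2}\,\phi_2 \left( \Box - m^2 - i\epsilon \right) \phi_2 \right] \tag{4.2.36}
\]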


for the kinetic terms of the diagonalized auxiliary fields, which in turn translates into 


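\[
  \lim_{\epsilon \to 0^+} \left[ \frac{-i}{k^2 + m^2 - i\epsilon} + \frac{i}{k^2 + m^2 + i\epsilon} \right]
  = \lim_{\epsilon \to 0^+} \frac{2\epsilon}{(k^2 + m^2)^2 + \epsilon^2}
  = 2\pi\,\delta(k^2 + m^2) , \tag{4.2.37}
\]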


for the Feynman propagators. Unlike the case of the retarded propagator, now the sign of $\epsilon$ 
is always positive and the two terms do not cancel each other. Rather, the result is the real 
part of the Feynman propagator. Had we chosen the opposite sign for $\epsilon$ in the propagator for 
the ghost, we would have lost unitarity but the ghost would have propagated positive energies 
forward in time, like an ordinary particle [118]. 
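The contrast between the two prescriptions can be checked numerically. The sketch below is only an illustration, not part of the original analysis: the function names are hypothetical, and it assumes the mostly-plus signature $k^2 = -(k^0)^2 + \vec{k}^2$. It verifies that the healthy/ghost pair cancels identically when both terms carry the same retarded shift, while with opposite $\pm i\epsilon$ the pair reduces to a Lorentzian of area close to $2\pi$, i.e. a regularized $2\pi\,\delta(k^2 + m^2)$.

```python
import math


def retarded_pair(k0, k_abs, m, eps):
    """Healthy minus ghost propagator, both with the same shift k0 -> k0 + i*eps."""
    denom = -(k0 + 1j * eps) ** 2 + k_abs ** 2 + m ** 2
    return 1.0 / denom - 1.0 / denom


def feynman_pair(x, eps):
    """Healthy plus ghost Feynman propagators, with x standing for k^2 + m^2.

    The two terms carry opposite signs of i*eps, so they no longer cancel.
    """
    return -1j / (x - 1j * eps) + 1j / (x + 1j * eps)


def lorentzian(x, eps):
    """Closed form of feynman_pair: 2*eps / (x^2 + eps^2)."""
    return 2.0 * eps / (x ** 2 + eps ** 2)


def lorentzian_area(eps, half_width, n):
    """Midpoint-rule integral of the Lorentzian over [-half_width, half_width]."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        total += lorentzian(x, eps) * dx
    return total
```

As `eps` is decreased at fixed `half_width`, the area returned by `lorentzian_area` approaches $2\pi$, which is the statement that the sum tends to a delta function rather than to zero.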

Coming back to the classical case, which involves the retarded propagators, the above 
argumentation only implies that the corresponding forces between two linear sources will indeed 
cancel out. At the fully non-linear level, however, these pairs of dynamical ghost/healthy fields 
will generically have different interactions and will thus be excited by sources in a non-trivial 
way. We therefore have potential tensor instabilities as soon as $m_g \neq 0$. This leaves the 
massless gravity theories $m_g = 0$ as the only potentially ghost-free theories. In the present 
form these are not scalar-tensor theories because the scalars are constrained, as also noted 
in [72,74,75,94,97,116], but their dynamical spectrum is the one of a scalar-tensor theory. 


Condensation? 

Note that what the above argumentation tells us is that the Minkowski solution may be 
perturbatively unstable, nothing more. Indeed, it may very well be the case that there exist other 
highly symmetric solutions, such as FLRW ones, around which the perturbations are healthy. 
One then says that the ghosts “condense” onto a solution around which the fluctuations have 
positive-definite kinetic energy, in the same way a tachyon condenses on a non-trivial minimum 
of the potential. The idea of ghost condensation has already been around for a decade as an 
interesting mechanism for addressing the dark energy and other cosmological problems [117]. As 
in the case of tachyon condensation, one typically needs higher-order terms in the derivatives, 
which would here correspond to higher-order terms in curvature in the non-local formulation. 

Unfortunately, for the Weyl model, which is the phenomenologically viable one, and for 
the case of interest where the stable solution is an FLRW solution, ghost condensation is 
not possible. Indeed, homogeneity and isotropy, along with the symmetries of the 4-tensors, 
force the latter to vanish on such space-times. Since a ghost field by definition acquires a 
non-trivial background value when it condenses, this cannot be the case for the 
auxiliary 4-tensors. Thus, we do not believe that the ghost could condense in the Weyl model, 
unless the stable solution is not homogeneous or isotropic. 


4.2.6 Stability 

As already discussed in section 3.3, the impact of ghosts in classical physics need not be so 
radical as in the quantum case. Indeed, classical instabilities can be dealt with if they are 
slow enough to pass phenomenological tests, or if they are stabilized by background/non-linear 
effects. 


Non-tachyonic ghosts 


If the mass of the ghost is non-tachyonic, we have that the corresponding dispersion relation 
will be 

\[
  \omega = \pm\sqrt{\vec{k}^2 - m^2} , \tag{4.2.38}
\]

so that only the modes at cosmological length-scales $|\vec{k}| < m$ are going to be unstable. Moreover, 
the maximal frequency of these modes being $|\omega| = m$, we have that the corresponding divergence 
will manifest itself at cosmological time-scales $\Delta t \sim m^{-1} \sim H_0^{-1}$, i.e. of the order of the age of 
the universe. Also, since these modes start at zero, they remain much smaller than one during 
the whole $\Delta t$ period, in which case our linear analysis is sufficient. Therefore, at scales where 
these instabilities are observable, Minkowski space-time is not the appropriate solution, and 
solar system and galactic physics are effectively stable. The stability analysis will be important in 
the context of cosmological perturbation theory, where the above dispersion relation argument 
is no longer enough, since large space and time scales will be involved. We will come back 
to this when we discuss the cosmological phenomenology. 
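The scale argument above can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, and it assumes the dispersion relation $\omega^2 = \vec{k}^2 - m^2$ in units where $c = 1$, so that modes with $|\vec{k}| < m$ grow like $e^{\Gamma t}$ with $\Gamma = \sqrt{m^2 - \vec{k}^2}$.

```python
import math


def growth_rate(k_abs, m):
    """Growth rate Gamma = sqrt(m^2 - k^2) of a mode with omega^2 = k^2 - m^2.

    Returns 0.0 for |k| >= m, where omega is real and the mode only oscillates.
    """
    omega_squared = k_abs ** 2 - m ** 2
    if omega_squared >= 0.0:
        return 0.0
    return math.sqrt(-omega_squared)


def e_folding_time(k_abs, m):
    """Time 1/Gamma for an unstable mode to grow by a factor e (inf if stable)."""
    rate = growth_rate(k_abs, m)
    return math.inf if rate == 0.0 else 1.0 / rate
```

The fastest growth occurs at $|\vec{k}| = 0$, where the e-folding time is $1/m$. For $m$ of the order of the present Hubble rate this is a cosmological time-scale, which is the sense in which the instability is slow.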

The typical example of such a non-tachyonic ghost will be the scalar mode in the case $Z = 0$, 
where one retrieves GR in the massless limit and thus does not spoil solar system constraints. 
Indeed, we will see that in this case the viable models are the ones with $m_s^2 > 0$. 


Tachyonic ghost 


Finally, for ghost modes that are also tachyonic, i.e. that obey the dispersion relation 

\[
  \omega = \pm\sqrt{\vec{k}^2 + m^2} , \tag{4.2.39}
\]

but have a negative energy at the linearized level, there is no divergence in the absence of 
interactions. Thus, in this case one must also include the non-linearities to pronounce the 
stability verdict. Tachyonic ghosts are expected in the auxiliary tensor modes since cancellation 
forces them to come in combinations such as (4.2.34), where it is the overall sign that is wrong. 


4.3 Projector-based models 

4.3.1 Constructing the equations 

We now wish to construct a projector-based generally-covariant extension of (2.7.62). As for the 
action-based generalizations, here too one has access to a plethora of combinations of curvature 
invariants and non-local operators. We will again consider only terms that contribute to the 
linearized equation over Minkowski space-time and no derivatives of curvature invariants. This 
is a bit more restrictive than in the action-based case since it gives 

Gfiu + ol (9fiuR) T + P a (3 1 G f iu) +7 (dfivO 1 R ) = 8-kGT^ . (4.3.1) 

Note that for the pure-trace terms $\sim g_{\mu\nu} K$ we have that the transverse part is uniquely defined 

\[
  (g_{\mu\nu} K)^{\rm TT} = 0 , \qquad (g_{\mu\nu} K)^{\rm T} = (g_{\mu\nu} K)^{\rm TpT} . \tag{4.3.2}
\]

For the term involving the Einstein tensor we have one more free parameter $a$, which corresponds 
to the choice of transverse operator. Note that choosing another combination of $R_{\mu\nu}$ and 
$g_{\mu\nu} R$ instead of $G_{\mu\nu}$ simply amounts to changing $\gamma$, thanks to the $\mathbb{R}$-linearity of 
the transverse projectors. 

A first remark on this class of models is that they share all the vacuum solutions of GR, 
just like the action-based Ricci model (4.2.16). Indeed, if $R_{\mu\nu} = 0$ then, by $\mathbb{R}$-linearity of the 
transverse projector, the left-hand side of (4.3.1) vanishes. Let us now fix the free parameters 
such that we retrieve (2.7.62) in the linearized limit. Linearizing over Minkowski 


g^i/ — T][n/ 2A h^ u , A — y/H ttG , 
and using (2.7.36) one gets that (4.3.1) reads 

^ (l+d(2a— 1)) R nif hp a + /3 (2d^lfi— a(d— 1)) R gif ^ per ■ 

Using (2.7.36), we can also rewrite (2.7.62) as 10 

^ zP’[Ilf hpa — m g ml/m'zfPpif = ~\T gv , 

so that, matching the two equations, we get 


(4.3.3) 


(4.3.4) 


(4.3.5) 


1 + d(2a — 1 ) = z , /3 = —m 2 , 


Keeping a as the free parameter, we then have 
z + d — 1 


2dl-a(d-l) = r ^ 

p mi 


a = 


= Z. 


(3 = —m 


2 

g ’ 


7 = 


ml + a(d — 1 )m 2 g 
2d 


(4.3.6) 


and thus the generally-covariant extension in terms of Z, m s , m g , a reads 

t o / ~ i x T ml + aid — 1 )m 2 r . . ~ , , t 

Gp V + Z ( g gv Rf - m 2 g a (g^U^R) = 8vr GT gu . 


(4.3.7) 


(4.3.8) 


As in the case of action-based models, only the $Z = 0$ case reduces to GR in the massless 
limit, but the price to pay is a scalar ghost in the spectrum. Moreover, as also discussed for 
the action-based models, the term involving the Einstein tensor in the departure from GR is 
phenomenologically excluded since it leads to non-viable FLRW solutions [72,73,106,107]. This 
in turn implies that $m_g = 0$, i.e. that the tensor modes are massless. This is in contrast with 
the action-based models, where the possibility of considering Weyl tensor terms allowed us to 
have $m_g \neq 0$ without affecting the FLRW solutions (footnote 11). We are thus led to consider 
the following class of models 


Gp U + 


l T 


zg, 


fllZ 


R ~ m l gp V G\ l R 


= 87 tGT, 


/AV 


(4.3.9) 


10 With the correct normalization for the source. 

11 Of course here too we could use the Weyl tensor but only if we accept derivatives acting on curvature, i.e. 
— 


terms like a (□ 2 V P V CT W^pvcr) 
4.3.2 Localization 

Localizing (4.3.9) involves both defining an auxiliary scalar $\psi$ to replace the inverse 
d'Alembertian acting on $R$ and invoking an auxiliary vector $\phi_\mu$ for the definition of the 
transverse part (4.1.7), (4.1.8). This gives 

G[iv + Zg^ v R — m s ~ V = Q'kGT^v , 

□<?V + V, y V /t <jV = 2ZV fl R- ^m 2 W ^ip, 

Uip = R. (4.3.10) 

The initial conditions of $\psi$ are determined by its definition 

ip = Ci~ 1 R, (4.3.11) 

and the initial conditions of $\phi_\mu$ are similarly determined by the ones of $R$. In the action-based 
case, the localized action allowed us to gain some insight into the dynamics of the auxiliary 
fields, i.e. to determine whether some fields are ghost-like or not. The above local equations of 
motion however do not derive from an action, so such features are less obvious to see here (footnote 12). 
Nevertheless, one can still detect potentially pathological behaviour. For instance, the vector 
field $\phi_\mu$ does not have the gauge-invariant kinetic term 

4.4 Solar system constraints 

4.4.1 No Vainshtein mechanism 

In local massive gravity, the vDVZ discontinuity is a discontinuity between the action and the 
propagator, i.e. the former reduces to its GR form in the $m \to 0$ limit, while the latter does not. 
As we discussed in the introduction, however, continuity is restored in the non-linear theory 
through the Vainshtein mechanism. The strong-coupling scale goes like a negative power of $m$, 
so that linear perturbation theory breaks down as $m \to 0$, or equivalently at small scales, and 
thus the propagator no longer reflects the forces that are present. 

In contrast, in all of the above non-local models, thanks to the trivial inversion properties 
of the linearized projectors, the tensor structure of the propagator (2.7.67) is the same as the 
tensor structure of the linearized action (4.2.1). Therefore, there is no discontinuity between 
action and propagator at any point of the parameter plane $(m_g, m_s)$ for all $z \neq 0$. Consequently, 
there is no need for a Vainshtein mechanism and the strong-coupling scale should be the Planck 
scale $M$. Let us have a closer look at this. 

The Vainshtein mechanism is a special case of a more general class of screening mechanisms 
known as “k-mouflage” [108]. The latter can occur in scalar-tensor theories where the scalar 
couples non-minimally to gravity and has a non-linear kinetic term. The former property is 
what makes the scalar couple to the source of gravity, after diagonalization, while the latter 
property is the one responsible for screening it on short distances. Indeed, the higher-order 
terms in the kinetic term will necessarily involve a mass scale $\Lambda \ll M$, which will correspond 
to the scale of strong-coupling. Let us now follow the argumentation of [38] to see how this 
screens the scalar force on scales smaller than $\Lambda^{-1}$. 

12 To see this, suppose such an action exists. Then, the term $\sim \nabla_{(\mu} \phi_{\nu)}$ in the first equation, which 
would correspond to the equation of motion of $g_{\mu\nu}$, would be a total derivative in the hypothetical 
action. Thus, such “friction” terms cannot derive from an action. 

In the case of a scalar-tensor theory a typical non-minimal coupling can be $\sim \phi R$. In the 
case of local massive gravity the scalar field is the Stückelberg scalar, which also couples derivatively 
to gravity. After diagonalization, the metric to which matter couples becomes of the form 

V OiTjfxi/cj) = h^ii/ + 5fu /, (4.4.1) 

where $\alpha \sim \mathcal{O}(1)$ if $h_{\mu\nu}$ and $\phi$ are canonically normalized. We thus have $\delta \neq 0$, which 
corresponds to the difference in the gravitational force felt by matter, i.e. the “fifth force”. In the 
diagonalized theory, a typical example for the non-linear kinetic term that leads to k-mouflage 
are the Galileon structures [25], such as 

(4.4.2) 

Note that the coupling of $\phi$ to the energy-momentum tensor has the same strength as for the 
graviton because $\alpha \sim \mathcal{O}(1)$. The equations of motion then read (schematically) 

\[
  \partial^2 h + M^{-1}\,\mathcal{O}(h\,\partial h\,\partial h) \sim M^{-1} T , \tag{4.4.3}
\]
\[
  \partial^2 \phi + \Lambda^{-3}\,\mathcal{O}(\partial^4 \phi^2) \sim M^{-1} T , \tag{4.4.4}
\]

where the interaction term in (4.4.3) can always be neglected since we work at energies below the 
Planck scale. We now have the following asymptotic behaviours. At “large” scales $\partial\phi \ll \Lambda\phi$, 
the linear term dominates in the scalar equation, so $\partial^2\phi \sim M^{-1} T \sim \partial^2 h$ and thus $\delta \sim \mathcal{O}(1)$. 

At “small” scales $\partial\phi \gg \Lambda\phi$, but still $\partial h \ll M h$ (footnote 13), it is the non-linear term that dominates, 
so $\partial^4\phi^2 \sim \Lambda^3 M^{-1} T \sim \Lambda^3 \partial^2 h$ and thus $\phi \sim \sqrt{\Lambda^3 h / \partial^2}$, which means that now $\delta$ is suppressed 
because $\Lambda/\partial \ll 1$. Thus, the fifth force is screened. 

In the non-local models we consider here, we see that the localized equations of motion 
do not have such non-linear kinetic terms in the auxiliary sector. This is why no Vainshtein 
mechanism takes place and why the strong-coupling scale goes down to the Planck mass. From 
the theoretical point of view, the absence of the Vainshtein effect is nice because it implies 
that linear perturbation theory is valid at small scales and for arbitrary values of the masses. 
In particular, for $Z = 0$, the solutions of the non-local models, computed as perturbative 
deformations of the ones of GR, will have an expansion parameter that is analytic in the 
masses. This feature has been verified for the spherically symmetric static solutions of the 
models with $Z = \xi = 0$ and $m_g = 0$ [74,75]. 

4.4.2 Constraints on Z 

From the phenomenological point of view, the absence of a Vainshtein mechanism implies that 
the forces that are present on small scales are the ones we read from the propagator (2.7.68) in 
the massless limit. We then have that the forces corresponding to the two scalar poles (on top 
of the massless graviton) cancel out only for $Z = 0$, while for $Z \neq 0$ we have a net fifth force 
which spoils solar system tests. This could have been expected, because $Z$ is a dimensionless 
parameter and the terms it controls are thus expected to deform GR at all scales, contrary to 
the terms $\sim m^2$, which deform it at the scale $m^{-1}$. 

13 So that we can neglect non-linearities for $h$. 

It is however interesting to note that, by considering non-linear structures involving a function 
$f\big((\Box - \xi R)^{-1} R\big)$ in the action-based model, one can avoid this conclusion, as shown in [97] 
in the context of the Deser-Woodard model (3.3.1) [65]. The argumentation used in [97] is elegant 
and will allow us to better understand the effect of the $\sim Z$ terms in our models, so we choose to 
reproduce it here with some minor adjustments. Let us work in $D = 4$ for simplicity. 

First note that homogeneity and isotropy imply that in cosmology the typical time-variation 
scale of the background is much larger than the gradients of the perturbations. Thus, as far as 
the action of $\Box$ on the Ricci scalar is concerned, the background dominates. In the standard 
cosmological history we have that $R$ is always positive, so 

\[
  (\Box - \xi R)^{-1} R \approx -\left( \partial_t^2 + 3H\partial_t + \xi R \right)^{-1} R \tag{4.4.5}
\]

is always negative for $\xi > 0$. Thus, only the region $x < 0$ of $f(x)$ is relevant for cosmology. 

On the other hand, for solar system physics, the phenomena are non-relativistic and thus 
the gradients are much more important than the time-derivatives. This is why the standard 
theoretical tool for solving the Einstein equations in this case is the post-Newtonian expansion, 
where the small expansion parameter is $v/c$, with $v$ being the typical velocity of the source. 
We then have that for non-relativistic systems $T_{\mu\nu}$ is dominated by the mass in $\rho = T_{00}$, so 
that the trace of the Einstein equation reads 

\[
  R \approx 8\pi G \rho > 0 . \tag{4.4.6}
\]

For gravitationally bound systems we have the typical profile $\Delta^{-1}\rho \sim +1/r$ for the gravitational 
potential outside the sources. We thus have that $(\Box - \xi R)^{-1} R \approx (\Delta - \xi R)^{-1} R$ is positive for 
$\xi = 0$, but does not have a definite sign for $\xi > 0$, a priori. We must therefore compare the 
$\Delta R$ and $\xi R^2$ terms. By dimensional analysis we have that 

\[
  \Delta R \approx 8\pi G\,\Delta\rho \approx 8\pi G L^{-2} \rho , \tag{4.4.7}
\]


where $L$ is the typical size of the bound system. For non-relativistic systems the total mass 
$M$ dominates the energy density, $\rho \sim M/L^3$, and $L$ is way larger than the corresponding 
Schwarzschild radius, $L \gg 2GM \equiv R_S$. Thus, the ratio gives 

\[
  \frac{\Delta R}{\xi R^2} \sim \frac{L}{\xi R_S} \gg 1 \tag{4.4.8}
\]

for $\mathcal{O}(1)$ values of $\xi$, and we conclude that for solar system physics 

\[
  (\Box - \xi R)^{-1} R > 0 . \tag{4.4.9}
\]


So it is the $x > 0$ part of $f(x)$ which affects the regime of GR we do not want to tamper with. 
One should therefore demand that $f(x) \approx 0$ for $x > 0$ in order not to spoil the solar system 
constraints, which implies in particular $f'(0) \approx 0$. For our models, this means $Z = 0$. 
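To get a feeling for how large the hierarchy between the size of a bound system and its Schwarzschild radius is, one can plug in solar-system numbers. The sketch below is only an order-of-magnitude illustration: the constants are rounded SI values, the factor of $c^2$ is restored by hand (the text works with $c = 1$), and taking $L$ to be one astronomical unit is an arbitrary choice made here.

```python
# Rounded SI constants (illustrative values, not taken from the text).
G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2
SPEED_OF_LIGHT = 2.998e8  # m s^-1
SOLAR_MASS = 1.989e30     # kg
ASTRONOMICAL_UNIT = 1.496e11  # m


def schwarzschild_radius(mass_kg):
    """R_S = 2 G M / c^2 in metres."""
    return 2.0 * G_NEWTON * mass_kg / SPEED_OF_LIGHT ** 2


def size_to_schwarzschild_ratio(size_m, mass_kg, xi=1.0):
    """Dimensional estimate L / (xi * R_S) of the ratio of the two competing terms."""
    return size_m / (xi * schwarzschild_radius(mass_kg))


if __name__ == "__main__":
    ratio = size_to_schwarzschild_ratio(ASTRONOMICAL_UNIT, SOLAR_MASS)
    print(f"R_S(Sun) = {schwarzschild_radius(SOLAR_MASS):.0f} m, L/R_S = {ratio:.2e}")
```

For the Sun the Schwarzschild radius is about 3 km, so for $L$ of one astronomical unit the ratio is of order $10^7$. Under these assumptions the gradient term dominates by many orders of magnitude for $\mathcal{O}(1)$ values of $\xi$.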

In the next chapter, we will see that the models where $Z > 1/3$, i.e. the ones where the 
scalar mode is healthy, actually do not even yield viable cosmological solutions. Thus, from 
now on we set $Z = 0$ and this implies $z = 1 - d$, so that the trace scalar is a ghost (see (2.7.68)). 

4.4.3 The potentially viable models 

We now have a clear picture of which models may provide a viable phenomenology. From the 
previous section we know that $Z = 0$. This already brings the projector-based model (4.3.9) 
to the form 

\[
  G_{\mu\nu} - \frac{d-1}{2d}\, m^2 \left( g_{\mu\nu} (\Box - \xi R)^{-1} R \right)^{\rm T} = 8\pi G\, T_{\mu\nu} , \tag{4.4.10}
\]

where 

\[
  m^2 \equiv \frac{1}{|z|}\, m_s^2 = \frac{1}{d-1}\, m_s^2 \tag{4.4.11}
\]

is the mass of the scalar mode. Eq. (4.4.10) is a one-parameter extension of the model proposed 
by Maggiore [73], corresponding to the case $\xi = 0$, and we will therefore dub it the “$\xi$-M model”. 
The localized form reads 


m 2 g^ip - V = 8-kGT^ , 

(4.4.12) 



D^ + V^f - m 2 V ^ip , 

(4.4.13) 

Dip = R. 

(4.4.14) 


For the action-based Weyl model (4.2.19) we still have the possibility of considering massive 
tensor modes $m_g > 0$. In cosmology, this parameter will only affect the perturbations around 
the FLRW solution, since the background Weyl tensor vanishes. Since from now on we will 
focus exclusively on the background part of cosmology, we are effectively left with the $m_g = 0$ 
theory. Thus, the action-based model of interest reads 


S = M 2 


d xy/^g 


R- - — -m 2 Rn~ 2 R 

M 


(4.4.15) 


where we have again used the mass of the scalar mode $m$. This is a one-parameter extension 
of the model proposed by Maggiore and Mancarella [75], corresponding to the case $\xi = 0$, so it 
makes sense to call (4.4.15) the “$\xi$-MM model”. The localized form of (4.4.15) is 


S 


d D xV^g 


M 2 R + M 



R — pdip 


d- 1 
4 d 


2 / 2 

m y> , 


(4.4.16) 


with $\phi$ and $\psi$ obeying 

<!> = (□ - CR); 1 , i’ = M(a- SR); 1 r . 


(4.4.17) 


It is convenient to consider the dimensionless scalars $\phi \to M\phi$ and $\psi \to M\psi$, so that the 
equations of motion read 


(G/j,u + 9^vG\ — V^V^) [1 + <p + £4>ip\ 


+V^(f)V u )ip 


1 

2 


9nvV P cpV p ip + 


d- 1 
8 d 


2 / 2 
m g^ip 


(□ - £R) <P 

(□-£W 


8irGT gi ,, 


d- 1 
2d 


m 2 ip , 


R. 


(4.4.18) 

(4.4.19) 

(4.4.20) 
From (4.4.16) we see that part of the scalar terms induce an effective Planck mass 

\[
  M^2 \left( 1 + \phi + \xi\phi\psi \right) R \equiv M_{\rm eff}^2\, R , \tag{4.4.21}
\]

which is not positive-definite. Therefore, gravity becomes unstable as soon as $M_{\rm eff}^2 < 0$. 

The Maggiore and Maggiore-Mancarella models, which correspond to the case $\xi = 0$, are 
currently receiving particular attention [71,76,78,104,119] because their phenomenology seems 
to privilege them among other non-local models that have been confronted with observations 
[65,68,70,73,74,97,107,109,110,120-124]. Indeed, they have recently passed the constraints of a 
full Boltzmann/Monte Carlo Markov Chain analysis [78], from which they come out as statistically 
indistinguishable from $\Lambda$CDM, with respect to the current precision of the data. The Maggiore 
model actually even seems to be slightly privileged. 

The elegance of these models lies in the fact that they are very simple non-local modifications 
of GR with as many parameters as $\Lambda$CDM, i.e. the mass $m$ plays the role of $\Lambda$. They are 
therefore very predictive since, once $m$ is fixed such that it reproduces the observed amount of 
dark energy today, the rest of the physics is determined. It is therefore highly non-trivial that 
these models can compete with $\Lambda$CDM. 

Here we see that, after having narrowed down the set of models to the potentially viable 
ones, there remains a natural extension of the Maggiore and Maggiore-Mancarella models 
corresponding to $\Box \to \Box - \xi R$. Considering one more parameter of course degrades predictivity, 
but it is nevertheless instructive to see what the effect of $\xi$ is. 

4.5 The effect of $\xi$ 

The effect of the $\xi$ parameter is very interesting because for $\xi \neq 0$ we have that 

\[
  (\Box - \xi R)^{-1} R \approx -\xi^{-1} , \qquad \text{if} \quad |\xi| \gg \left| (\Box^{-1} R)^{-1} \right| . \tag{4.5.1}
\]

Thus, as soon as $R \neq 0$, the dynamics of these models should be indistinguishable from GR 
with a cosmological constant $\Lambda \sim m^2$, for large enough $\xi$. If on the other hand $R = 0$, which 
is the case during RD for the cosmological background for instance, then of course 
$(\Box - \xi R)^{-1} R = 0$ by linearity. 

Not surprisingly, a first effect of $\xi > 0$ is the existence of de-Sitter solutions $G_{\mu\nu} + \Lambda g_{\mu\nu} = 0$. 
Assuming a constant $R$, we have 

d — 1 m 2 1 

A = ~8d^^ , </> = -!, ^ = ( 4 - 5 - 2 ) 

for the action-based model, and 

A=C ^ lr ^, 0/* = O, '0 = -^, (4-5.3) 

for the projector-based one. 

No degravitation 

One of the original motivations for considering non-locality was not only to produce a dark 
energy effect, but also to degravitate any constant source. The very existence of de-Sitter 
solutions for $\xi \neq 0$ implies that these sources are not excluded, but the effective $\Lambda$ could still be 
different from the one that would appear in the equations of motion. Unfortunately, this is not 
the case. Indeed, for both models, adding a vacuum energy term simply shifts $\Lambda \to \Lambda + \Lambda_{\rm vac}$, 
so it is not degravitated at all. Note that this argument does not encompass the $\xi = 0$ models, 
nor the possibility of a dynamical degravitation mechanism, i.e. a time-dependent degravitation 
in the cosmological context. As we will see in the next chapter, however, no such effect takes 
place. 
Chapter 5 

Cosmology 


Here we work in $D = 4$ and consider the background cosmology of the action-based $\xi$-MM 
model (4.4.15) and the projector-based $\xi$-M model (4.4.10). This chapter is based on, and 
extends, [70,71]. 


5.1 Background equations 

We now consider a flat ($k = 0$) FLRW metric in cosmic time 

\[
  g_{\mu\nu}\, dx^\mu dx^\nu = -dt^2 + a^2(t)\, d\vec{x}^2 , \tag{5.1.1}
\]

so that all fields depend exclusively on time. We will use $x = \log a$ as the time coordinate and 
denote by a prime the derivative with respect to $x$, so that 

\[
  \dot\phi = H \phi' . \tag{5.1.2}
\]

The case $k \neq 0$ is also interesting, but we will not consider it both for simplicity and because 
$k = 0$ is consistent with the present data. 


5.1.1 Action-based model 


For the $g_{00}$ equation in (4.4.18) we get the modified Friedmann equation 

H 2 = a (p + pde) (5.1.3) 

where $\rho = T_{00}$ and 

1 + 0 + 4>' + £ {(jyij) + (0-0)0 + g 0 / V’ / ’ PDE 967 tG ^ ^ ^ 

We see that with this rearrangement the system has turned into a Friedmann equation with 
a time-dependent Newton’s constant and a dynamical dark energy component induced by the 
mass. Another “effective Newton’s constant” is (4.4.21) which appears in the localized action 
(4.4.16) and must be monitored since its sign is the one of the kinetic term of gravity. We 
therefore also define another parameter 


1 

1 + 0 + £00 ’ 


(5.1.5) 
We now go to dimensionless variables 

\[
  h \equiv \frac{H}{H_0} , \qquad
  \rho \to \frac{8\pi G}{3H_0^2}\,\rho , \qquad
  \rho_{\rm DE} \to \frac{8\pi G}{3H_0^2}\,\rho_{\rm DE} = \frac{1}{4}\,\mu^2\psi^2 , \qquad
  \mu^2 \equiv \frac{m^2}{9H_0^2} , \tag{5.1.6}
\]


where the 0 subscripts denote evaluation today, $x_0 = 0$, so the system of equations is 

\[
  h^2 = \alpha \left( \rho + \rho_{\rm DE} \right) , \tag{5.1.7}
\]
\[
  \phi'' + (3 + \zeta)\,\phi' + 6\xi(2 + \zeta)\,\phi = 3\mu^2 h^{-2}\psi , \tag{5.1.8}
\]
\[
  \psi'' + (3 + \zeta)\,\psi' + 6\xi(2 + \zeta)\,\psi = -6(2 + \zeta) , \tag{5.1.9}
\]

where for $\rho$ we consider a fluid made of matter and radiation 

\[
  \rho = \rho_R^0\, e^{-4x} + \rho_M^0\, e^{-3x} , \tag{5.1.10}
\]


and 

h 7 *_ 1 h~ 2 p' - 3 p 2 h~ 2 ip (1 + £0) + 40 (1 + 0/0 + (1 - 20 + 4 £</> (6 + 6£0 + fi) 

^ = ~h ~ 2 1 + ( 1-60 ( 1 + 0/00 ' 

(5.1.11) 

where in * we have used the equations of motion to get rid of the second time-derivatives. 
Given (4.4.17), the initial conditions of $\phi$ and $\psi$ are zero at the initial time $t_i$ if the latter is 
well inside the RD era, since $R_{\rm RD} = 0$: 

\[
  \phi(t_i) = \phi'(t_i) = \psi(t_i) = \psi'(t_i) = 0 . \tag{5.1.12}
\]


Now, had we chosen to include the $\sim Z$ term of (4.2.19), the denominator of $\zeta$ would have 
rather been 


RD 


2 ( 1 - 3 Z) + (1 - 60 {Zi/; + 20 (1 + 0 / 0 ) 2 ( 1 - 3 Z) ' 


(5.1.13) 


In the case $Z > 1/3$, which corresponds to $z > 0$ and thus to the case where the scalar mode is 
healthy, we have that $\zeta$ has the opposite sign and thus $H$ is growing. Thus, on top of spoiling 
solar system physics, the $Z$ parameter also spoils the cosmological background solution in the 
region where it is interesting to consider, i.e. where it makes the scalar healthy. 

Also, observe that the $\alpha$ factor appears in front of all the energy components, i.e. had we 
added a “vacuum” cosmological constant in the action we would have simply obtained 

\[
  \rho + \rho_{\rm DE} \to \rho + \rho_{\rm DE} + \rho_{\rm vac} . \tag{5.1.14}
\]

From here it is clear that no degravitation of $\rho_{\rm vac}$ can be achieved without also degravitating 
matter and radiation. Moreover, what we will observe in the simulations is $\alpha > 1$ at 
late times, so we will have an enhancement of the source rather than a screening effect. Thus, 
even in the dynamical context, no degravitation mechanism appears. 

Finally, we would like to identify the variables which conveniently characterize the departure 
from GR. In $\Lambda$CDM one has that the equation of state parameter of the source can be expressed 
in terms of $H$. Indeed, one uses the barotropic equation of state $p = w\rho$ and the conservation 
of energy 

\[
  \dot\rho = -3H(\rho + p) = -3H(1 + w)\,\rho \quad \Rightarrow \quad \partial_x \log\rho = -3(1 + w) , \tag{5.1.15}
\]

and replaces $\rho$ using the Friedmann equation $\rho \sim H^2$ to find 

\[
  w = -1 - \frac{2}{3}\,\zeta , \tag{5.1.16}
\]

where $\zeta \equiv h'/h$. In our case, this quantity represents the equation of state of the effective source 
seen by $H$, namely $\rho_{\rm eff} \equiv \alpha(\rho + \rho_{\rm DE})$. For RD ($w = 1/3$), MD ($w = 0$) and de-Sitter ($w = -1$) 
phases we get 

\[
  \zeta_{\rm RD} = -2 , \qquad \zeta_{\rm MD} = -\frac{3}{2} , \qquad \zeta_{\rm dS} = 0 . \tag{5.1.17}
\]

It will also be interesting to have the equation of state corresponding to $\rho_{\rm DE}$, which we define 
through the “conservation equation” of this effective source (5.1.15) 

\[
  w_{\rm DE} \equiv -1 - \frac{1}{3}\,\partial_x \log\rho_{\rm DE} = -1 - \frac{2}{3}\,\frac{\psi'}{\psi} . \tag{5.1.18}
\]
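The relation between the logarithmic slope of $H$ and the effective equation of state can be checked with a few lines of code. This is a minimal sketch assuming only $\rho \propto H^2$ and $\partial_x \log\rho = -3(1 + w)$, with $\zeta$ denoting $h'/h$; the function names are hypothetical.

```python
def zeta_from_w(w):
    """zeta = h'/h for a single fluid with equation of state w: -(3/2) * (1 + w)."""
    return -1.5 * (1.0 + w)


def w_from_zeta(zeta):
    """Inverse relation: w = -1 - (2/3) * zeta."""
    return -1.0 - 2.0 * zeta / 3.0


if __name__ == "__main__":
    for name, w in [("RD", 1.0 / 3.0), ("MD", 0.0), ("dS", -1.0)]:
        print(name, zeta_from_w(w))
```

The three reference values are $-2$ for radiation domination, $-3/2$ for matter domination and $0$ for de Sitter. A numerical solution can therefore be classified at any time $x$ by comparing its $\zeta(x)$ with these numbers.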


5.1.2 Projector-based model 

Homogeneity and isotropy imply that only the $\phi_0$ component of the auxiliary vector is non-zero. 
It is then convenient to trade it for a new variable (which is not a scalar) 

(5.1.19) 

m z 

whose equation of motion can be found by taking the time-derivative of the $\mu = 0$ part of 
(4.4.13) and using (4.4.14). Given the definition of $\psi$ (4.3.11) and the fact that only retarded 
Green’s functions are invoked in the definition of the transverse part, both $\psi$ and $\phi_0$ have 
vanishing initial conditions 

\[
  \phi_0(t_i) = \dot\phi_0(t_i) = \psi(t_i) = \dot\psi(t_i) = 0 , \tag{5.1.20}
\]

for $t_i$ well inside the RD phase. Observe that this only implies $\phi(t_i) = 0$. To get the condition 
on $\phi'$ one must evaluate the second-order equation of $\phi_0$, i.e. the $\mu = 0$ part of (4.4.13), at 
$t_i$, to get that $\ddot\phi_0(t_i) = 0$ as well and thus 

\[
  \phi(t_i) = \phi'(t_i) = \psi(t_i) = \psi'(t_i) = 0 . \tag{5.1.21}
\]


Now the modified Friedmann equation, i.e. the $\mu\nu = 00$ part of (4.4.12), takes again the form 
(5.1.3) with 

\[
  \alpha = 1 , \qquad \rho_{\rm DE} = \frac{m^2 \phi}{24\pi G} . \tag{5.1.22}
\]

Had we chosen to consider the $\sim Z$ term of (4.3.9) we would have rather found 

\[
  \alpha = \frac{1}{1 - 2Z(2 + \zeta)} . \tag{5.1.23}
\]

As in the action-based case, here too the $Z > 1/3$ choice would be problematic. Indeed, in RD 
we have that $\alpha = 1$ since $\zeta_{\rm RD} = -2$. But if we are supposed to reach a DE phase at late times, 
i.e. $w \approx -1$, then by (5.1.16) we have $\zeta_{\rm DE} \approx 0$ and thus 

\[
  \alpha \approx \frac{1}{1 - 4Z} . \tag{5.1.24}
\]
For $Z > 1/3$ this is negative, so at some point between the two phases $\alpha^{-1}$ must go through 
zero, which means that space-time has a singularity $H \to \infty$ before today. As a consequence, 
the choice of a healthy scalar mode $Z > 1/3$ spoils the background evolution for the 
projector-based models as well. Note that for $Z = 0$, which is the case of interest, $\alpha = 1$, so that 
there is no dynamical degravitation. 
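The sign argument can be made explicit with a short sketch. It reads (5.1.23) as $\alpha = 1/(1 - 2Z(2 + \zeta))$, where the symbol name $\alpha$ and the function names below are assumptions made for illustration, and it locates the value of $\zeta$ at which $\alpha^{-1}$ vanishes.

```python
def alpha_inverse(Z, zeta):
    """alpha^{-1} = 1 - 2 Z (2 + zeta) for the projector-based model with a Z term."""
    return 1.0 - 2.0 * Z * (2.0 + zeta)


def zeta_at_singularity(Z):
    """Value of zeta where alpha^{-1} = 0, or None if Z <= 0 (no such point)."""
    if Z <= 0.0:
        return None
    return 1.0 / (2.0 * Z) - 2.0


def singular_between_rd_and_de(Z):
    """True if alpha^{-1} crosses zero for some zeta strictly between -2 (RD) and 0 (DE)."""
    zeta_star = zeta_at_singularity(Z)
    return zeta_star is not None and -2.0 < zeta_star < 0.0
```

For example, `alpha_inverse(0.5, -2.0)` equals 1 while `alpha_inverse(0.5, 0.0)` equals -1, so the zero is crossed at `zeta_at_singularity(0.5) == -1.0`. In this simple formula the sign change already occurs for any $Z > 1/4$, which includes the range $Z > 1/3$ discussed in the text, whereas for $Z = 0$ one has $\alpha^{-1} = 1$ throughout.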

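Defining again the dimensionless variables (5.1.6), but now with 

\[
  \rho_{\rm DE} \to \frac{8\pi G}{3H_0^2}\,\rho_{\rm DE} = \mu^2\phi ,
\]

the system of equations becomes 

\[
  h^2 = \rho + \rho_{\rm DE} , \tag{5.1.25}
\]
\[
  \phi'' + (3 - \zeta)\,\phi' - 3(1 + \zeta)\,\phi = -3\psi' + 3(1 + \zeta)\,\psi , \tag{5.1.26}
\]
\[
  \psi'' + (3 + \zeta)\,\psi' + 6\xi(2 + \zeta)\,\psi = -6(2 + \zeta) , \tag{5.1.27}
\]

and 

\[
  \zeta \equiv \frac{h'}{h} = \frac{1}{2}\,\frac{\rho' + \mu^2\phi'}{\rho + \mu^2\phi} . \tag{5.1.28}
\]

Note that $\psi$ has exactly the same equation as in the action-based model (5.1.9), i.e. it is 
the field which localizes $(\Box - \xi R)^{-1} R$. Finally, we can again define $w_{\rm DE}$ through the 
“conservation equation” of $\rho_{\rm DE}$ to get 

\[
  w_{\rm DE} \equiv -1 - \frac{1}{3}\,\partial_x \log\rho_{\rm DE} = -1 - \frac{1}{3}\,\frac{\phi'}{\phi} . \tag{5.1.29}
\]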


5.2 Numerical analysis 

Set-up 

According to the Planck data [8], which assume $\Lambda$CDM, we have $\rho_R^0 = 9.21 \times 10^{-5}$ and 
$\rho_M^0 = 0.3175$. Since our solutions will be close to $\Lambda$CDM up until today, we will choose these 
values as well (footnote 1). The matter-radiation equality then occurs at $x_{\rm eq} \approx -8.15$, with today 
being $x_0 = 0$. We will start our numerical integration at $x = -40$, that is, well inside the RD era, 
so that we can safely impose zero initial conditions on $\phi$ and $\psi$ for both the action-based and 
projector-based models. 

Note that consistency requires $h_0 = 1$, so here this is achieved by tuning $\mu^2$ appropriately. 
This is analogous to the case of $\Lambda$CDM, where one of the energy density components is 
determined by the defining condition that the present-day density fractions sum to one. Here, 
however, we do not have the data that determine $\mu^2$ algebraically, since they include the field 
values today and we only control the initial conditions. Therefore, $\mu^2$ will be determined by 
successive trials and we will stop when $\log h_0 = \mathcal{O}(10^{-6})$. The resulting value will depend on 
$\xi$, the second parameter of the model, so imposing $h_0 = 1$ actually fixes the relation $\mu^2(\xi)$. 

1 For the $\xi = 0$ models a full parameter estimation using CMB, BAO and SNe data has been presented in [78] 
and the values chosen here are consistent with their results. Since the $\xi > 0$ models lie somewhere between 
the $\xi = 0$ ones and $\Lambda$CDM, these values should be alright for them too. 
Data description

Let us now describe our results, which are collectively displayed in the plots and tables of
section 5.2.1. We have computed the cases $\xi = 2^n$, where

n = -\infty, -6, -5, -4, -3, -2, -1, 0, 1, 2 .   (5.2.1)

In the plots the color goes from blue to red with increasing $n$, while the ΛCDM result is given in
green for comparison. In figure 5.2 we have plotted the quantity $\log (h / h_{\Lambda{\rm CDM}})$, where $h_{\Lambda{\rm CDM}}$
is the dimensionless Hubble parameter of ΛCDM, normalized to 1 today. In figure 5.3 we have
plotted the effective equation of state parameter $w$ (5.1.16), but since the results overlap too
much at $x = 0$ we have also plotted the difference with ΛCDM in figure 5.4 to get a cleaner
picture. In figure 5.5 we have plotted today’s values of $w$ with respect to $\xi$. In figures 5.6 and
5.7 we have plotted the effective dark energy component $\rho_{\rm DE}$ and the corresponding equation
of state $w_{\rm DE}$, respectively. In figures 5.8 and 5.9 we have $\phi$ and $\psi$, where we must stress that
the former is a different non-local functional of $R$ in each model. Moreover, in the action-based
model it is $\psi$ that controls the dark energy component $\rho_{\rm DE}$, while in the projector-based model
it is $\phi$. In figure 5.10 we have plotted the two quantities of the action-based model which
correspond to the (dimensionless) effective Newton’s constant (5.1.4) in the modified Friedmann
equation (5.1.3) and the effective Newton’s constant (5.1.5) in the localized action (4.4.16),
respectively. Finally, in table 5.1 we have given the numerical values of $\mu^2$, $w_0$ and $w_{{\rm DE},0}$.

Analysis 

A first general remark is that, by increasing $\xi$, we get arbitrarily close to ΛCDM, as anticipated
in section 4.5. More precisely, note how, as $\xi$ increases, the dark energy component $\rho_{\rm DE}$ tends
to behave more and more like a cosmological constant, both in the future and in the past around
$x = 0$² (figure 5.6), while the two effective Newton’s constants of the action-based model
tend towards one (figure 5.10).

Thus, for large enough $\xi$, one should find the $\mu^2(\xi)$ relation of the de-Sitter solutions (4.5.2)
and (4.5.3) with the $\Lambda$ of ΛCDM, i.e. $\rho_\Lambda = 1 - \Omega_R^0 - \Omega_M^0 \approx 0.6824$. More precisely, defining

= (5.2.2) 

we have that (4.5.2) and (4.5.3) give 

\mu^2 = 4 \rho_\Lambda \xi^2 , \qquad \mu^2 = \rho_\Lambda \xi ,   (5.2.3)

respectively. In figure 5.1 this relation corresponds to the green line, and we see that the dots
indeed follow that trend already for small $\xi$ values. For very small $\xi$ the transition
to the dS phase is not yet complete at $x = 0$ and thus (5.2.3) does not hold.
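
As a quick cross-check, the tuned values of table 5.1 can be compared with the de-Sitter relation. The sketch below assumes that (5.2.3) reads $\mu^2 = 4 \rho_\Lambda \xi^2$ for the action-based model and $\mu^2 = \rho_\Lambda \xi$ for the projector-based one, with $\rho_\Lambda = 1 - \Omega_R^0 - \Omega_M^0$. The numbers are copied from table 5.1, and the function name is illustrative.

```python
OMEGA_LAMBDA = 1.0 - 9.21e-5 - 0.3175  # rho_Lambda of LambdaCDM, about 0.6824

# (log2 xi, mu^2 action-based, mu^2 projector-based), copied from table 5.1
TABLE = [
    (-2, 0.17022, 0.17733),
    (-1, 0.68365, 0.34773),
    (0, 2.7328, 0.69005),
    (1, 10.9275, 1.3739),
    (2, 43.687, 2.7408),
]


def relative_deviations(table=TABLE):
    """Relative deviation of the tuned mu^2 from the assumed de-Sitter relation."""
    rows = []
    for n, mu2_action, mu2_projector in table:
        xi = 2.0 ** n
        ds_action = 4.0 * OMEGA_LAMBDA * xi ** 2  # assumed action-based relation
        ds_projector = OMEGA_LAMBDA * xi          # assumed projector-based relation
        rows.append((n, mu2_action / ds_action - 1.0,
                     mu2_projector / ds_projector - 1.0))
    return rows


if __name__ == "__main__":
    for n, dev_action, dev_projector in relative_deviations():
        print(n, dev_action, dev_projector)
```

With these inputs the deviations shrink as $\xi$ grows, which is the trend shown by the dots in figure 5.1.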

For the action-based model the de-Sitter solution in the $\xi > 0$ case is an
attractor, since the universe reaches that state asymptotically (see figures 5.3 and 5.6). The
acceleration is faster than in ΛCDM (see figure 5.2), but one tends towards $H_{\Lambda{\rm CDM}}$ with
increasing $\xi$. For the projector-based model that solution is unstable and the universe is instead
attracted towards a $w = -1/3$ phase after the de-Sitter one, for all values of $\xi$. Increasing
$\xi$ however makes the de-Sitter phase last longer (see figures 5.2, 5.6 and 5.7), as could be
expected from the fact that in the $\xi \to \infty$ limit one recovers ΛCDM. A $w = -1/3$ value is
interesting since it implies zero acceleration $\ddot{a} = 0$, and therefore $a \sim t$. Thus, although the
dark energy component tends to zero as $t \to \infty$, it dominates over matter at late times.

² Although it is forced to be zero during RD.

Figure 5.1: The $\mu^2(\xi)$ relation which gives $h_0 = 1$ (red dots) along with an interpolation (blue
line) and the de-Sitter solution relation (5.2.3) (green line). Panels: action-based and projector-based.

Another noteworthy feature is that the observable departure from GR (figures 5.2, 5.3 and
5.4) starts roughly around today, i.e. when the curvature $\sim H^2$ approaches the $m^2$ scale. On
the other hand, the dark energy component $\rho_{\rm DE}$ starts being non-zero as we enter the MD era,
i.e. roughly around $x_{\rm eq} \simeq -8$, since this is when $R$ “wakes up”.

Finally, the fact that the dark energy component starts from zero and then grows, i.e.
$\rho_{\rm DE} > 0$ and $\rho'_{\rm DE} > 0$ at the beginning, implies that $w_{\rm DE}$ starts below $-1$ because of (5.1.15).
Thus, non-local dark energy models have in common that their equation of state starts on
the phantom side.
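
The sign argument can be made explicit with a one-line function. This is only a sketch, assuming that (5.1.15) has the same form as (5.1.29), i.e. $w_{\rm DE} = -1 - \frac{1}{3} \partial_x \log \rho_{\rm DE}$; the function name is illustrative.

```python
def w_de(rho_de, drho_de_dx):
    """w_DE = -1 - (1/3) d(log rho_DE)/dx, with x = log a (assumed form)."""
    return -1.0 - drho_de_dx / (3.0 * rho_de)


if __name__ == "__main__":
    # A positive and growing dark energy density lies on the phantom side.
    print(w_de(1.0, 0.3) < -1.0)
```

A positive density that grows with $x$ gives $w_{\rm DE} < -1$, a constant density gives exactly $-1$, and a decaying one gives $w_{\rm DE} > -1$.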


5.2.1 Plots & tables
Figure 5.2: The logarithmic departure from the Hubble parameter of ΛCDM.

Figure 5.3: The effective equation of state parameter $w$.

Figure 5.4: Departure from the equation of state parameter of ΛCDM.


Figure 5.5: The effective equation of state parameter today $w_0$ (red dots) with the ΛCDM
result (blue line), as a function of $\xi$. Panels: action-based and projector-based.
Figure 5.6: The dimensionless effective dark energy component $\rho_{\rm DE}$.

Figure 5.7: The dark energy effective equation of state parameter $w_{\rm DE}$.

Figure 5.8: The dimensionless localizing field $\phi$.

Figure 5.9: The dimensionless localizing scalar $\psi$.

Figure 5.10: The effective Newton’s constants (5.1.4) and (5.1.5) of the action-based model.
                      Action-based                        Projector-based
log_2 ξ     μ²          w_0        w_DE,0       μ²          w_0        w_DE,0
-∞          0.0089235   -0.7816    -1.1307      0.050252    -0.7108    -1.0417
-6          0.0113795   -0.7664    -1.1144      0.055373    -0.7069    -1.0359
-5          0.0144145   -0.7528    -1.0992      0.060895    -0.7033    -1.0306
-4          0.022624    -0.7300    -1.0720      0.073170    -0.6968    -1.0212
-3          0.050545    -0.7005    -1.0314      0.10260     -0.6874    -1.0074
-2          0.17022     -0.6824    -0.9994      0.17733     -0.6804    -0.9971
-1          0.68365     -0.6829    -1.0011      0.34773     -0.6819    -0.9993
0           2.7328      -0.6822    -1.0007      0.69005     -0.6820    -0.9994
1           10.9275     -0.6835    -1.0004      1.3739      -0.6822    -0.9997
2           43.687      -0.6812    -1.0001      2.7408      -0.6821    -0.9996
ΛCDM        -           -0.6824    -1           -           -0.6824    -1

Table 5.1: The values of the mass parameter and today’s effective equation of state parameters.


5.3 Analytic approximations 


Now that we have some concrete insight into the physics, let us try to reproduce the essence of 
the numerical results through analytic approximations. The equations of motion can be solved 
analytically if we assume that $w$, or alternatively $\zeta$, is constant, which is the case when we are
well inside a definite phase of the universe’s history (5.1.17). Here we know that the solutions
admit such plateau values (see figure 5.3), but even if we did not, we could assume they exist 
and check the consistency of the solutions afterwards. 

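We start by solving for $\psi$ (5.1.9), which obeys the same equation in both models. For $\xi = 0$
we get

\psi = -6\, \frac{2 + \zeta}{3 + \zeta}\, x + a_1 + a_2 \exp\left[ -(3 + \zeta)\, x \right] ,   (5.3.1)

while for $\xi \neq 0$ we get

\psi = -\frac{1}{\xi} + a_1 \exp\left[ -\frac{x}{2} \left( 3 + \zeta - \sqrt{(3 + \zeta)^2 - 24 \xi (2 + \zeta)} \right) \right]
       + a_2 \exp\left[ -\frac{x}{2} \left( 3 + \zeta + \sqrt{(3 + \zeta)^2 - 24 \xi (2 + \zeta)} \right) \right] .   (5.3.2)

These have the same form only in RD, where $\zeta = -2$,

\psi = C + a_2 e^{-x} .   (5.3.3)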

For more general $\zeta$, the exponentials are decaying if $\zeta > -3$ (and thus $w < 1$), which is the case
in all phases of interest (5.1.17), so these solutions are stable. In the $\xi = 0$ case we then have a
linear evolution, while in the $\xi \neq 0$ case we have an attractor behaviour towards $-1/\xi$. This is
confirmed in figure 5.9, although the convergence is quite slow for low $\xi$. In RD, which is where
we begin, the integration constants are fixed by the choice of initial conditions. Remember
that these are theory-level data, i.e. different choices correspond to different definitions of $\Box^{-1}$
and thus to different theories. Here the data (5.1.12) translate into $C = a_2 = 0$, thus giving³

\psi_{\rm RD} = 0 .   (5.3.5)

For $\xi \neq 0$, we have that during the MD and DE phases $\psi = -1/\xi$ because of the attractor
behaviour (5.3.2). So let us focus on $\xi = 0$, where the solution takes the form (5.3.1). In the
simplest approximation, the beginning of the MD phase $\zeta = -3/2$ occurs at matter-radiation
equality $x_{\rm eq} = \log (\Omega_R^0 / \Omega_M^0)$, so this is where $\psi$ should start being non-zero. This gives

\psi_{\rm MD} \simeq -2 (x - x_{\rm eq}) , \qquad x > x_{\rm eq} .   (5.3.6)

Then, considering $x = 0$ as the transition from MD to de-Sitter $\zeta \simeq 0$, and matching with the
above result, we get that

\psi_{\rm DE} \simeq -2 (2x - x_{\rm eq}) , \qquad x > 0 .   (5.3.7)

Indeed, in figure 5.9 we see that the magnitude of the slope increases (from 2 to 4) after MD, and as a further
check we can verify that $\psi(0) \simeq 2 x_{\rm eq} \simeq -16$ seems correct. In the projector-based case, the
slope then decreases again in the future because we pass from the quasi-de-Sitter phase $\zeta = 0$
to the ultimate $w = -1/3$ phase, giving $\zeta = -1$, and thus a slope of magnitude 3.
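
The slopes quoted above follow from the linear term of the $\xi = 0$ solution. The sketch below assumes that this term is $-6 (2 + \zeta) / (3 + \zeta)$, as obtained from $\psi'' + (3 + \zeta) \psi' = -6 (2 + \zeta)$ at constant $\zeta$, and that (5.1.16) relates $w$ and $\zeta$ through $w = -1 - \frac{2}{3} \zeta$. Both function names are illustrative.

```python
import math


def zeta_from_w(w):
    """Invert the assumed relation w = -1 - (2/3) zeta."""
    return -1.5 * (1.0 + w)


def psi_slope(zeta):
    """Slope of the linear part of psi for xi = 0 at constant zeta."""
    return -6.0 * (2.0 + zeta) / (3.0 + zeta)


# Matter-radiation equality from the density fractions quoted in the text.
X_EQ = math.log(9.21e-5 / 0.3175)

if __name__ == "__main__":
    for name, w in [("RD", 1.0 / 3.0), ("MD", 0.0), ("dS", -1.0), ("w=-1/3", -1.0 / 3.0)]:
        print(name, psi_slope(zeta_from_w(w)))
    print("x_eq =", X_EQ, " psi(0) ~", 2.0 * X_EQ)
```

The four phases give slopes of 0, -2, -4 and -3, in line with the values used in the matching above.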

Let us now look at each model separately. 

5.3.1 Action-based model 

We wish to solve (5.1.8) for $\phi$ by analytic approximations. To do so, we first need to solve for
$h$ with a constant $\zeta$

h' = \zeta h \quad \Rightarrow \quad h \sim e^{\zeta x} .   (5.3.8)

Then, we start by computing the solution for RD, where $\psi_{\rm RD} = 0$, to get

\phi = b_1 + b_2 e^{-x} \to b_1 ,   (5.3.9)

whatever the value of $\xi$, so this result is stable. With vanishing initial conditions we have

\phi_{\rm RD} = 0 .   (5.3.10)

For the subsequent phases we must consider the $\xi > 0$ and $\xi = 0$ cases separately.

³ Considering a non-zero integration constant in RD corresponds to a different theory, namely, the one where
the inversion of $\Box$ is affine, i.e. it is of the form $\psi \sim f + \Box_r^{-1} R$, where $f$ is a homogeneous solution $\Box f = 0$.
This extension has been studied in [70] for the projector-based model with $\xi = 0$. Since $f$ is made of a constant
part and a decaying exponential, the non-trivial part of the modification is $f = {\rm const}$, and this simply amounts
to adding an $m$-dependent cosmological constant in the equation. Indeed, since $g_{\mu\nu} f$ is trivially transverse, we
have

m^2 \left( g_{\mu\nu} \Box^{-1} R \right)^T = m^2 g_{\mu\nu} f + m^2 \left( g_{\mu\nu} \Box_r^{-1} R \right)^T .   (5.3.4)

The effect on cosmology is similar to the one of $\xi$, as it bridges the Maggiore model with ΛCDM.
The case $\xi > 0$

Using (5.3.8) and $\psi = -1/\xi$, the equation of $\phi$ gives


(5.3.11) 


6 ~__ e - 2 ? x + horn 

0 2£((C-3)C + 3£(2 + C)) + ' 

where the homogeneous part is the same as for $\psi$ (5.3.2), since their equations differ only through
their sources. Therefore, the homogeneous solutions of $\phi$ are stable as well. We can thus focus
on the inhomogeneous part, which is diverging for MD, $\zeta = -3/2$. Indeed, the $\phi$ profile in the
interval $x < 0$ of figure 5.8 is exactly the one of an exponential with a negative $\mathcal{O}(1)$ factor
in front. In the de-Sitter phase however, the solution is attracted towards a constant. The
de-Sitter solutions are known exactly (4.5.2) and coincide with the observed value of $-1$. With
this behaviour for $\phi$, and $\psi = -1/\xi$, we have that $\alpha$ (5.1.4) is also constant at late times and
thus so is $H^2$.


The case $\xi = 0$

To get the MD solution here we have to use (5.3.6)

4> ~ -37 ^ ( 9 ( x “ Xe q) “ 5 ) e3a: + b i + b 2 e~^ x , (5.3.12) 

ol 

which is again unstable and fits with figure 5.8. In the de-Sitter case $\zeta = 0$ we have to use the (5.3.7)
solution to get

0 ~ - /x 2 (3x — 2) x + b\ + b' 2 e~ 3x . (5.3.13) 

o 

Surprisingly, this is not at all the kind of behaviour we observe, since $\phi$ is constant at late times.
This implies that the assumption $\zeta = 0$ is not valid, i.e. $\zeta$ tends towards zero as $x \to \infty$, but
too slowly. We therefore need a more precise ansatz for $\zeta$ and we thus proceed perturbatively
from infinity. Using the leading order solutions


■0DE ~ -4x , 


0DE = “I , 


(5.3.14) 


we have that (5.1.7) and (5.1.11) give 


h 2 


4 ju 2 x 

’ 


6fi 2 h 2 

l + (t> 


(5.3.15) 


and thus imply

\zeta \simeq \frac{3}{2x} .   (5.3.16)

As a check, in the left panel of figure 5.11 we have plotted $\psi / x$ and $x \zeta$ to see that they
indeed tend towards $-4$ and $3/2$, respectively. We can then solve $h' = \zeta h$ to find $h \sim x^{3/2}$. Now
that we have the more precise profiles $h^2 \sim x^3$ and $\zeta \simeq 3/(2x)$ for large $x$, we can plug them
in the equation of $\phi$ and solve. The result is a combination of a Meijer G-function, an error
function and a decaying exponential, whose $x \to \infty$ limit is an integration constant, consistent
with the numerical result of figure 5.8.

A growing Hubble parameter at late times is more violent than the constantly accelerated 
expansion of a de-Sitter phase, so let us see what it implies for the fate of the universe. 
Figure 5.11: The functions $\psi / x$ and $x \zeta$ in the action-based $\xi = 0$ model tending towards the
values $-4$ and $3/2$, respectively.


Big rip singularity 

We have $H = (2/T)\, x^{3/2}$, for some positive constant $T$ with dimensions of time. To estimate
the latter, we look for the asymptotic value of $x^{-3/2} h$ at large $x$ and find a good
estimate in $x^{-3/2} h \to 0.09$, so we have that $T \simeq 22\, H_0^{-1}$. The equation for $a(t)$ is then

\dot{a} = H a = \frac{2}{T}\, (\log a)^{3/2}\, a ,   (5.3.17)

whose solution is

a(t) = \exp\left[ \frac{T^2}{(t_{\rm rip} - t)^2} \right] .   (5.3.18)

This is an example of the so-called “big rip” singularity, i.e. the divergence of the scale factor
and the Hubble parameter at finite time

\lim_{t \to t_{\rm rip}^-} a(t) = \infty , \qquad \lim_{t \to t_{\rm rip}^-} H(t) = \infty .   (5.3.19)

In our case this occurs far in the future, since $T$ corresponds to several times the age of the
universe. Moreover, we must not forget that, since $H$ is growing in the DE era, the curvature $R$
will eventually reach an energy scale where this effective description ceases to be valid, so the
region close to the singularity cannot be trusted.
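
That the exponential profile solves the evolution equation can be checked numerically. The sketch below assumes $x = \log a = T^2 / (t_{\rm rip} - t)^2$ and $H = \dot{x} = (2/T)\, x^{3/2}$, works in units where $H_0 = 1$, and uses illustrative function names and an arbitrary value of $t_{\rm rip}$.

```python
def log_a(t, T, t_rip):
    """x = log a on the assumed big rip solution."""
    return T ** 2 / (t_rip - t) ** 2


def hubble_closed_form(t, T, t_rip):
    """H = (2/T) x^{3/2} evaluated on the solution."""
    return (2.0 / T) * log_a(t, T, t_rip) ** 1.5


def hubble_numeric(t, T, t_rip, eps=1e-6):
    """H = dx/dt estimated with a central finite difference."""
    return (log_a(t + eps, T, t_rip) - log_a(t - eps, T, t_rip)) / (2.0 * eps)


# T in units of 1/H0, from the asymptotic value x^{-3/2} h -> 0.09 quoted above.
T_ESTIMATE = 2.0 / 0.09

if __name__ == "__main__":
    t_rip = 30.0  # arbitrary illustrative value, not a prediction
    for t in (0.0, 5.0, 20.0):
        print(t, hubble_closed_form(t, T_ESTIMATE, t_rip),
              hubble_numeric(t, T_ESTIMATE, t_rip))
```

The two columns agree up to finite-difference error, and both grow without bound as $t$ approaches $t_{\rm rip}$.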

It turns out that a big rip is a usual consequence of phantom equation of state parameters
$w_{\rm DE} < -1$. Indeed, the phenomenology of such types of dark energy was first considered
in [125, 126]⁴, where it was realized that $w < -1$ in GR would generically imply a future
singularity at a finite time (5.3.19). For constant $w$ this is easy to show. The continuity and
first Friedmann equations read

\dot{\rho} + 3 H (1 + w)\, \rho = 0 , \qquad \dot{a} = a \sqrt{\frac{8 \pi G}{3}\, \rho} .   (5.3.20)

The first gives $\rho = \rho_0\, a^{-3(1+w)}$ and, plugging this in the second, we get

\dot{a} = H_0\, a^{-\frac{3}{2}(1+w) + 1} .   (5.3.21)

The solution can be written as

a(t) = \left[ -\frac{3}{2}\, H_0 (1 + w) (t_{\rm rip} - t) \right]^{\frac{2}{3(1+w)}} ,   (5.3.22)

where $t_{\rm rip}$ is the integration constant. Since $1 + w < 0$, the bracket is positive, while the power
is negative and we thus indeed have (5.3.19). In our case we have that $w < -1$, but it tends
towards $-1$ as time passes. Thus, whether there will be a big rip or not depends on how fast
this convergence is. We now know that for $\xi > 0$ there is no big rip, but rather an eternal
de-Sitter phase, while for $\xi = 0$ no de-Sitter solution exists and we have a big rip. This feature
can be traced back to the discontinuity of the asymptotic behaviour of $\psi$ as $\xi \to 0$. For $\xi > 0$
we have that $\psi$ tends to the constant value $-1/\xi$, while for $\xi = 0$ it goes like $\sim -4x$.

⁴ For a study of the type of future singularities caused by phantom dark energy see [127], and for the case
where this occurs with the Deser-Woodard type of non-locality see [112].
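
For the constant-$w$ case, the time left before the singularity follows from normalizing the scale factor to one today. The sketch below assumes the solution $a(t) = [-\frac{3}{2} H_0 (1 + w)(t_{\rm rip} - t)]^{2/(3(1+w))}$ for a universe dominated by a single phantom fluid; the function names are illustrative.

```python
def time_to_rip(w, h0=1.0):
    """t_rip - t_0 for a(t_0) = 1, in units of 1/h0; requires w < -1."""
    if w >= -1.0:
        raise ValueError("a big rip requires w < -1")
    return -2.0 / (3.0 * h0 * (1.0 + w))


def scale_factor(t, w, t_rip, h0=1.0):
    """Assumed constant-w solution, valid for t < t_rip."""
    return (-1.5 * h0 * (1.0 + w) * (t_rip - t)) ** (2.0 / (3.0 * (1.0 + w)))


if __name__ == "__main__":
    w = -1.1
    t_rip = time_to_rip(w)  # today is t_0 = 0
    print(t_rip, scale_factor(0.0, w, t_rip), scale_factor(0.9 * t_rip, w, t_rip))
```

The closer $w$ is to $-1$, the later the singularity, which is why the rate at which $w$ approaches $-1$ decides whether a big rip occurs at all.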


5.3.2 The projector-based model 

We now focus on (5.1.26), which does not depend explicitly on $\xi$, although $\psi$ does. In the RD
phase we have $\psi_{\rm RD} = 0$, so $\phi$ has only a homogeneous solution, which is decaying, and thus the
$\phi_{\rm RD} = 0$ solution is stable. We then enter MD, where the choice of $\xi$ is relevant.


The case $\xi > 0$

With $\psi = -1/\xi$ we can solve (5.1.26)

\phi = \frac{1}{\xi} + b_1 \exp\left[ -\frac{x}{2} \left( 3 - \zeta - \sqrt{21 + \zeta (6 + \zeta)} \right) \right]
       + b_2 \exp\left[ -\frac{x}{2} \left( 3 - \zeta + \sqrt{21 + \zeta (6 + \zeta)} \right) \right] .   (5.3.23)

In MD, $\zeta = -3/2$, the exponentials decay and we are attracted towards the constant solution
$1/\xi$, as can be checked in figure 5.8. In de-Sitter, $\zeta = 0$, however, we have a diverging mode
$\sim \exp\left( (\sqrt{21} - 3)\, x / 2 \right)$ in the homogeneous solution, so this phase is unstable. This leads us
to the final stage of the universe’s history, which is a $\zeta = -1$ phase ($w = -1/3$), in which case

\phi = b'_1 + b'_2 e^{-4x} ,   (5.3.24)

so this phase is stable. From figure 5.8 we see that $b'_1 = 0$.
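
The stability statements for the three phases can be read off from the exponents of the homogeneous solutions. The sketch below assumes that the homogeneous part of (5.1.26) is $\phi'' + (3 - \zeta) \phi' - 3 (1 + \zeta) \phi = 0$ at constant $\zeta$, whose characteristic roots are $\frac{1}{2} ( \zeta - 3 \pm \sqrt{21 + \zeta (6 + \zeta)} )$. The function name is illustrative.

```python
import math


def phi_exponents(zeta):
    """Roots r of r^2 + (3 - zeta) r - 3 (1 + zeta) = 0, so that phi ~ exp(r x)."""
    disc = math.sqrt(21.0 + zeta * (6.0 + zeta))
    return (0.5 * (zeta - 3.0 - disc), 0.5 * (zeta - 3.0 + disc))


if __name__ == "__main__":
    for name, zeta in [("MD", -1.5), ("dS", 0.0), ("w=-1/3", -1.0)]:
        print(name, phi_exponents(zeta))
```

Both exponents are negative in MD, one is positive in de-Sitter (the diverging mode), and for $\zeta = -1$ they are $-4$ and $0$, i.e. a decaying mode plus a constant.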


The case $\xi = 0$

To get the MD solution here we have to use (5.3.6) to get

\phi = -2 + 2 (x - x_{\rm eq}) + {\rm hom.} ,   (5.3.25)

where the homogeneous part decays. This is indeed what we observe in figure 5.8, i.e. a linear trend
which cuts the $x = 0$ axis at approximately $\phi(0) \approx -2 (1 + x_{\rm eq}) \simeq 14$. Then, for $\zeta = 0$, using
(5.3.7) we get again the same kind of diverging mode in the homogeneous solution as in the
$\xi > 0$ case. We must thus finally consider the case $\zeta = -1$, and $\psi \simeq -3x$, where the solution is

\phi = \frac{9}{4}\, x + b_1 + b_2 e^{-4x} .   (5.3.26)

As in the $\xi = 0$ action-based model, this final trend is not at all the behaviour we observe,
which means that $\zeta$ does not tend fast enough to $-1$. Here however we have a simpler way to
deduce $\phi$ at large $x$. Indeed, since here $\alpha = 1$, we have that $w = w_{\rm DE}$ at late times so we can
use (5.1.29) and (5.1.16) to get



\frac{\phi'}{\phi} = 2 \zeta .   (5.3.27)


Then, using the lowest order result $\zeta = -1$, the above equation gives $\phi \sim e^{-2x}$, which is indeed
the behaviour we observe in figure 5.8. Note that this technique would not have worked in the action-based
model, because there $\alpha \neq 1$ in the future (see figure 5.10). Indeed, had we used $w = w_{\rm DE}$
together with (5.1.18) and (5.1.16), we would have found $\zeta = 1/x$ instead of $3/(2x)$. Thus, $w$ and
$w_{\rm DE}$ tend to the same value but at different paces.


5.4 Stability 

As we have already argued in section 4.2.6, the diverging modes of a non-tachyonic ghost, which
is what we have here, should manifest themselves at time scales of the order of the inverse mass scale.
This implies that the background solutions we have studied above are potentially unstable
under linear perturbations, but this does not necessarily spoil the viability of the cosmological
history. Indeed, since $m \sim H_0$, the typical time interval for the divergence to become notable
is of the order of the age of the universe, $\Delta t \sim m^{-1} \sim H_0^{-1}$.

The linear perturbations of the $\xi = 0$ models have been studied in [76], where it was
shown that there are indeed no notable divergences up until today. As already mentioned,
these models have even been studied with a full Boltzmann/MCMC code and found to be
statistically equivalent to ΛCDM [78], with respect to the present data precision. We know
that with large enough values of $\xi$ we can approach GR with a cosmological constant with
arbitrary precision. At the level of the cosmological background evolution, we have indeed
verified that $\xi$ interpolates between the $\xi = 0$ models and ΛCDM. There is therefore no reason
why this should not be the case in general, and we thus expect the $\xi$ extensions to be equally
viable at the level of the perturbations as well.

An interesting fact regarding the perturbations is that in the action-based model they are
actually even bounded. The perturbations of the two auxiliary scalar modes are given in figure
5.12⁵. We have plotted several different values of the comoving wave-number $\kappa \equiv k / k_{\rm eq}$, where
$k_{\rm eq} = a_{\rm eq} H_{\rm eq}$ is the comoving wave-number corresponding to the horizon scale at matter-radiation
equality⁶. Since $k_{\rm eq} \simeq 42 H_0$, the displayed choices of $\kappa$ range from
sub-horizon to super-horizon modes today, and all of them tend to a constant for large $x$.
Incidentally, the same holds with respect to cosmic time $t$ and, in particular, they are smooth
in the $t \to t_{\rm rip}^-$ limit. We see that the large wave-length modes tend to diverge, as expected, soon
after $x = 0$, but are then quickly tamed towards a constant evolution. We can now understand
this as the consequence of Hubble friction. Indeed, if $H$ admits a singularity at finite time, the
big rip, then by continuity the Hubble friction term $\sim H \dot{\phi}$ in the equations of the scalars will
inevitably dominate at some point over any other term, i.e. even over the tendency of ghost
modes to diverge⁷.

Figure 5.12: The linear perturbations of $U = -\psi$ and $V = -\frac{\mu^{-2}}{3}\, \phi$ as a function of $x$ for the
modes $\kappa = 5 \times 10^{-3}$ (blue), $\kappa = 5 \times 10^{-2}$ (purple), $\kappa = 5 \times 10^{-1}$ (brown), $\kappa = 5$ (green) in the
MM model.

One could still be worried by the small window where $\delta\phi$, $\delta\psi$ grow significantly around
$x = 0$, especially in the case of large scales where the effect is the strongest. However, as shown
in [76], this has no notable effect on the evolution of observable quantities such as the dark
matter energy density or the Bardeen potentials.

Finally, note that the $w_{{\rm DE},0}$ values found here, which range between $-1.13$ and $-1$, are
consistent with the present observational data [78], but nevertheless give different predictions
than ΛCDM⁸. Future missions such as the Dark Energy Survey [129] and EUCLID [130] are
expected to measure $w_{{\rm DE},0}$ with percent precision and will thus allow one to discriminate these
models from ΛCDM. The $\xi$-parametrization we proposed, which is an original feature of the
present thesis, allows more flexibility for matching the desired value, since it covers all values
of $w_{{\rm DE},0}$ from the one of the MM model, $w_{{\rm DE},0} \simeq -1.13$, up to the one of ΛCDM, $w_{{\rm DE},0} = -1$.
Of course this lowers the predictive power of the model, but we see that the predictions remain
quite sharp.

⁵ Courtesy of Yves Dirian.

⁶ For the numerical integration the set-up of [76] has been used.

⁷ In [76], instead of focusing on the scalar modes themselves, the authors have chosen to treat the deviation
from GR as an effective dark energy fluid and thus focused on the effective quantities $\rho_{\rm DE}$, $p_{\rm DE}$, $\theta_{\rm DE}$ and $\sigma_{\rm DE}$,
that are the energy density, pressure, velocity and anisotropic stress scalars, respectively. The conservation
equation then leads to the evolution equation for the contrast $\delta_{\rm DE} = \delta\rho_{\rm DE} / \rho_{\rm DE}$, which is eq. (6.9) of [76]. In
this description, the “wrong” relative sign appears in the fact that the sound speed squared $c_s^2$ is negative at all
times, as shown in figure 16 of [76]. The fact that $\delta_{\rm DE}$ tends to zero in the future (figure 14 of [76]) had already
made the authors of [76] deduce that the Hubble friction dominated the dynamics.

⁸ Note that, if one assumes a constant $w_{\rm DE}$ for the background, then the present data narrow the result down
to $w_{\rm DE} = -1.00 \pm 0.05$ [128], which therefore excludes part of the models we have considered here. However, in
these models $w_{\rm DE}$ is not constant and a full comparison with the data has proved their viability, even if $w_{{\rm DE},0}$
can go down to $-1.13$.
Chapter 6

Conclusions 


In this thesis we have elaborated on the formulation, properties and phenomenology of some
non-local theories of gravitation containing a fixed mass parameter, with the ultimate aim
of providing a viable dark energy model.


Linear massive gauge theories 

We have started our investigation by trying to understand, from several viewpoints, the properties
of linear massive gauge theories in order to prepare the ground for their non-local formulations
and generalizations. We have found that performing a $d + 1$ harmonic decomposition of
the fields, whether in the Lagrangian or Hamiltonian formalisms, provides a very transparent
understanding of the dynamical content of these theories. In particular, this decomposition
reveals the structure of the spin-2 theory with generic mass term. Once the non-dynamical
fields have been integrated out, the $d$-scalar sector (2.4.50), which is the interesting one, can
be neatly represented by two fields, one of which is a ghost


fiscal. — 


d — 1 


d D x 


1 


1 


-- m 2 & 2 — 4> (p — Act) 


( 6 . 0 . 1 ) 


+ X(-d u Gd>'G- l -m 2 


vrr 


ghost 


G 2 - 


m 

d- 1 


G(p- dp) 


with 


m 


2 _ 
ghost — 


1 + d (1 + 1/ct) 
d^l 


m 


( 6 . 0 . 2 ) 


where 4> = 4> + m~ 2 G and both 4? and G are analytic in m. From this the dependence of the 
physics on the (m 2 ,a) parameters is clear. The Fierz-Pauli point a = 0 is the only ghost-free 
theory, but it is also the only one which is discontinuous in the m —> 0 limit, since 4> survives 
in the action. Remarkably, 4> is a gauge-invariant combination, under the gauge symmetry of 
the massless theory. This implies that, although the massive action is not gauge-invariant, the 
physics is invariant under a 2-parameter subset of gauge transformations, so that this could be 
called a “hidden” symmetry. 

Moreover, this property is preserved on a de-Sitter background as well, but then the $d$-scalar
field $\Omega$ is a combination of the $h_{\mu\nu}$ fields that is non-local in time, and actually quite
ugly (2.4.97). Again, integrating out the non-dynamical fields, the $d$-scalar action reads (2.4.96)


fiscal. = d —±^U D xV=~g 

a m z I 


-~d u nd p n--m 2 n 2 

2 M 2 


(6.0.3) 


where $M^2 = m^2 - (d - 1) H^2$. This reflects quite elegantly the dependence of the spectrum
on the mass $m$ on a de-Sitter background, with the special case $M^2 = 0$ corresponding to the
so-called “partially massless theory”.

We have then moved on to the computation of the propagators of each theory and have
discussed the Stückelberg formalism. Both approaches show how the apparent discontinuity
in the degrees of freedom as $m \to 0$ can be understood as the smooth decoupling of some
modes. Using the Stückelberg trick, we were able to reformulate the equations of motion
in a gauge-invariant way, even in the presence of a mass, with the price to pay being the
loss of locality. Nevertheless, locality is restored with the appropriate choice of gauge, which
leads us to interpret the mass term as an obstruction to having both gauge-invariant and
local representations of the theory. In the spin-2 case, we have also found that the non-local
formulation of Fierz-Pauli theory actually has one more gauge symmetry than GR itself! This
is the symmetry of linearized conformal transformations which is responsible for killing the
ghost mode in this context.

A useful by-product of this construction are the transverse projectors $\mathcal{P}$, that is, non-local
operators which make the gauge field transverse and gauge-invariant and thus allow a
straightforward construction of massive gauge-invariant theories. In the spin-1 case, only one
such projector exists and the only gauge-invariant quadratic theory one can construct is nothing
but the non-local formulation of the Proca action of massive electrodynamics. In the spin-2
case however, because the subspace of transverse tensors splits into traceless and pure-trace
parts, there are two projectors and thus one has access to more models than the ones that
are equivalent to the local theory. These are therefore genuinely non-local, i.e. they are non-local
whatever the gauge we choose. We have thus considered these models, and in particular
(2.7.62)

(□ - m 2 ) oP^u P ah pcT + (zD - m 2 ) s Vp V pah pa = —Tp U , (6.0.4) 

which, on top of a massive graviton, contains an extra propagating scalar mode corresponding 
to the trace h. In the local theory, this mode is either non-dynamical (the Fierz-Pauli mass 
term), or it is ghost-like (all other mass terms). In the above non-local theory it is both 
dynamical and healthy if z > 0. Finally, the projectors have also simplified our computation 
of propagators in the non-local setting thanks to their nice algebraic properties (2.7.36). 

We have also shown in more than one way an important aspect of local linear gauge theories,
which is that the constraint structure due to gauge symmetry guarantees
the rule $N_{\rm f} = 2 N_{\rm d}$, i.e. that there are always twice as many degrees of freedom as there are
dynamical fields. This is important to note because it does not hold in the case of non-local
field theories in general.


Non-local subtleties 

We have then paused to discuss some peculiar aspects of non-local field theory. We have
mentioned that the usual variational principle applied to a non-local action cannot yield
causal equations of motion, but that there exists a modification of that principle which respects
causality. The construction is inspired by the “in-in” formalism for the quantum effective
action $\Gamma$, i.e. the action which controls the dynamics of the expectation value of the field
operator. The corresponding variational principle requires initial data to be imposed, instead
of boundary data, and thus provides an action-based description of irreversible systems. This
is for example the case of non-local field theories, where the combination of non-locality and
causality privileges the past with respect to the future and thus implies an arrow of time.

We have also discussed the localization procedure, which turns non-local equations into local
ones by integrating in auxiliary fields, and thus allows us to see the dynamical content of the
theory. The auxiliary fields have constrained initial conditions, because these data correspond
to the fixed choice of $\Box^{-1}$ we make in the non-local theory. However, they obey dynamical
equations of motion, so that $N_{\rm f} < 2 N_{\rm d}$ in general. The only exceptions to this rule are the
non-local formulations of local theories, where the auxiliary fields correspond to Stückelberg
fields and are thus pure-gauge.

The presence of dynamical fields that have constrained initial conditions forbids any quantum
interpretation of genuinely non-local theories, since one cannot implement these constraints
at the quantum level, in terms of constraints on the Hilbert space, without spoiling unitarity. 
Thus, genuinely non-local theories are necessarily classical effective theories. 

We have then addressed the important issue of classical stability. Indeed, non-local theories 
often contain ghost-like or tachyonic dynamical fields, that are only seen in the localized theory. 
In the literature their constrained status has often been invoked in order to minimize their 
impact on stability. We have argued that, on the contrary, they should be considered on the 
same footing as regular dynamical fields in a stability analysis, i.e. they are very capable of 
destabilizing a given solution of interest. This is because, for genuinely non-local theories, these 
fields interact non-linearly and are thus excited whatever their initial conditions, making the 
initial data constraints irrelevant in a stability analysis. The latter must therefore be performed
just as in the case of unconstrained dynamical fields to decide whether some solution is stable
or not.


Non-local gravity and cosmology 

Armed with what we have learned in the previous chapters, we finally went on to construct
generally-covariant non-local theories of gravity, massive or not. We saw two ways to proceed,
the action-based one and the projector-based one, in order to guarantee the transversality of
our equations. Having constructed a class of models, some simple phenomenological constraints
have narrowed it down to two models, the $\xi$-M projector-based model (4.4.10)

G_{\mu\nu} - \frac{d - 1}{2d}\, m^2 \left( g_{\mu\nu} \Box_\xi^{-1} R \right)^T = 8 \pi G\, T_{\mu\nu} ,   (6.0.5)

and the $\xi$-MM action-based model (4.4.15)

S = M^2 \int d^D x\, \sqrt{-g} \left[ R - \frac{d - 1}{4d}\, m^2 R\, \Box_\xi^{-2} R \right] ,   (6.0.6)

where $\Box_\xi \equiv \Box - \xi R$. These are not theories of massive gravity, since the tensor modes are
massless, although for the $\xi$-MM one could add a Weyl-squared term $W \Box^{-2} W$ to make them
massive without spoiling background cosmology. These are one-parameter extensions of the
models proposed by Maggiore [73] and Maggiore-Mancarella [75], corresponding to the case
$\xi = 0$.

In the limit $\xi \to \infty$ one obtains GR with a cosmological constant $\Lambda \sim m^2$, so the
phenomenology of these models should lie between the $\xi = 0$ ones and $\Lambda$CDM. We have confirmed
this for the cosmological background through both a numerical analysis and analytic
approximations. For $\xi > 0$ we found that both models admit de Sitter solutions, although they are
unstable in the projector-based case. Indeed, there the future universe ultimately leaves the
de Sitter phase to settle on a $w = -1/3$ phase.

These theories share the same linearized limit and contain a scalar ghost. However, the
latter is ultra-light and the divergence is expected to manifest itself only at cosmological
time-scales of the order of the age of the universe. This has been confirmed by a recent study of the
perturbations for $\xi = 0$ [76], i.e. the divergence is too slow to spoil the observational tests. In
the $\xi = 0$ action-based model, the ghost dynamics are even bounded, which is explained by a
big rip singularity in the future. It implies that at some point Hubble friction will dominate,
thus diluting the perturbations, and it appears that this domination occurs already shortly
after today. The $\xi = 0$ models have both been recently found to be consistent with the present
data, and as favoured as $\Lambda$CDM, through a full Boltzmann/MCMC analysis [78]. This should
therefore also hold for the $\xi > 0$ models since they lie somewhere in between. Although
considering one more parameter ($\xi$) for models that already work perfectly well can only lower
their predictive power, we find it interesting to have a parameter that continuously bridges to
GR with $\Lambda > 0$.

Appendix A

Bi-tensors 


In this appendix we define the notion of bi-tensor, the mathematical structure behind generally 
covariant Green’s functions, and discuss some properties that are going to be useful for our 
purposes. 


A.1 Definition

Just as higher-rank tensors are constructed using the tensor products (in the sense of fibre
bundle theory) of vectors and covectors, bi-tensors can be constructed through some other type
of tensor product of ordinary tensors. In order to formalize this construction, it is convenient
to first recall some properties of ordinary tensors and of the corresponding tensor product.

A.1.1 Tensors
Manifold & scalars

We start with a $D$-dimensional real differentiable manifold $\mathcal{M}$ with atlas $\mathcal{A}_{\mathcal{M}}$, i.e. a set of pairs
$(U_i, f_i)$ of open sets $U_i \subset \mathcal{M}$ and homeomorphisms

$$ f_i : U_i \to \mathbb{R}^D , \qquad p \mapsto x_i^\mu , \quad \mu = 0, 1, \dots, d , $$ (A.1.1)

such that the $U_i$ cover all of $\mathcal{M}$ and the transition functions

$$ f_{ij} \equiv f_i \circ f_j^{-1} : f_j(U_i \cap U_j) \to f_i(U_i \cap U_j) , $$ (A.1.2)

are smooth. Any continuous map $\phi : \mathcal{M} \to \mathbb{R}$ can then be represented by functions $\phi_i$ from
$\mathbb{R}^D$ to $\mathbb{R}$ by pulling it back along some $f_i^{-1}$

$$ \phi_i \equiv \phi \circ f_i^{-1} : f_i(U_i) \to \mathbb{R} . $$ (A.1.3)

A scalar field is then defined as such a map for which all $\phi_i$ are smooth. Inverting we get
$\phi = \phi_i \circ f_i$, so on $U_i \cap U_j$ we have

$$ \phi_i \circ f_i = \phi_j \circ f_j , \qquad \Rightarrow \qquad \phi_j = \phi_i \circ f_{ij} , $$ (A.1.4)

which is nothing but the transformation rule for a scalar function

$$ \phi_i(x_i) = \phi_j(x_j) , $$ (A.1.5)

under the coordinate transformation $x_i = f_{ij}(x_j)$. Since the $U_i$ cover $\mathcal{M}$ and the $f_i$ are
homeomorphisms, the $\phi_i$ functions fully determine $\phi$. Finally, we note that the
set of scalar fields, denoted by $C^\infty(\mathcal{M})$, forms an algebra whose addition and multiplication
operations are the ordinary point-wise addition and multiplication in the target space $\mathbb{R}$.


Tangent bundle 

We then consider the tangent bundle $T^1\mathcal{M}$. This is a $2D$-dimensional differentiable manifold
along with a continuous surjective map $\pi : T^1\mathcal{M} \to \mathcal{M}$, such that $\pi^{-1}(p) \simeq \mathbb{R}^D$ for all $p \in \mathcal{M}$.
In fibre bundle language, $T^1\mathcal{M}$ is the total space, $\mathcal{M}$ is the base and $\mathbb{R}^D$ is the fibre. This
structure means that $T^1\mathcal{M}$ locally looks like $\mathcal{M} \times \mathbb{R}^D$, i.e. every point of $\mathcal{M}$ has a neighbourhood
$U_i \subset \mathcal{M}$ such that $\pi^{-1}(U_i) \simeq U_i \times \mathbb{R}^D$. As a matter of fact, once $\pi$ is given, we restrict the atlas
of $\mathcal{M}$ to the charts whose open set $U_i$ is small enough to satisfy this condition, i.e. to the sets
which “trivialize” the fibre bundle. The atlas of the tangent bundle $\mathcal{A}_{T^1\mathcal{M}}$ is then constructed
out of $\mathcal{A}_{\mathcal{M}}$ as follows. For every chart $(U_i, f_i) \in \mathcal{A}_{\mathcal{M}}$ we pick an open set $V_i \subset T^1\mathcal{M}$ and a
homeomorphism

$$ g_i : V_i \to \mathbb{R}^{2D} , \qquad q \mapsto (x_i^\mu, k_i^\mu) , $$ (A.1.6)

such that

$$ \pi(V_i) = U_i , \qquad (f_i \circ \pi \circ g_i^{-1})(x_i, k_i) = x_i , \qquad \bigcup_i V_i = T^1\mathcal{M} , $$ (A.1.7)

i.e. $g_i$ is such that the function associated to the projection map is the trivial projection onto
the base coordinates 1. Moreover, the set of charts $(V_i, g_i)$ must be such that the corresponding
transition functions

$$ g_{ij} \equiv g_i \circ g_j^{-1} : g_j(V_i \cap V_j) \to g_i(V_i \cap V_j) , $$ (A.1.8)

read

$$ g_{ij}(x_j, k_j) = \left( f_{ij}(x_j) ,\ \frac{\partial f_{ij}^\mu}{\partial x_j^\nu}(x_j)\, k_j^\nu \right) . $$ (A.1.9)

A set of such pairs $(V_i, g_i)$ constitutes an atlas $\mathcal{A}_{T^1\mathcal{M}}$ for $T^1\mathcal{M}$. The appearance of the transition
functions of $\mathcal{M}$ in the transformation of the fibre coordinates in (A.1.9) shows that the structure
of $T^1\mathcal{M}$ is naturally induced by the one of $\mathcal{M}$.

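A vector field $X$ is a section of this bundle, i.e. a continuous map $X : \mathcal{M} \to T^1\mathcal{M}$ that is
a right-inverse of the projection $\pi \circ X = {\rm id}_{\mathcal{M}}$. It can be expressed through local
functions, i.e. on $U_i$ we define its pullback $X_i \equiv X \circ f_i^{-1}$, which by the property $\pi \circ X = {\rm id}_{\mathcal{M}}$
has, in the chart $g_i$, the form

$$ X_i : f_i(U_i) \to g_i(V_i) , \qquad x_i \mapsto \left( x_i^\mu , X_i^\mu(x_i) \right) , $$ (A.1.10)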


1 The fact that we can cover $T^1\mathcal{M}$ with as many $V_i$ as there are $U_i$ is possible because we have demanded
that $\pi^{-1}(U_i) \simeq U_i \times \mathbb{R}^D$.

and the $X_i^\mu(x_i)$ are required to be smooth. As for the scalars, the full set of $X_i$ functions fully
determines $X$. Since the projection map is trivial, the relevant information ultimately lies in
the functions $X_i^\mu(x)$ that are what one usually refers to as “the local components of the vector
field” on $U_i$ 2. As in the case of scalar fields, we can invert $X = X_i \circ f_i$ and have that on
$U_i \cap U_j$

$$ X_i \circ f_i = X_j \circ f_j , \qquad \Rightarrow \qquad X_j = X_i \circ f_{ij} , $$ (A.1.11)

which, given (A.1.9), translates into the well-known rule

$$ X_i^\mu(x_i) = \frac{\partial x_i^\mu}{\partial x_j^\nu}(x_j)\, X_j^\nu(x_j) , $$ (A.1.12)

under the coordinate transformation $x_i = f_{ij}(x_j)$. The set of sections, denoted by $\Gamma(T^1\mathcal{M})$,
forms a $C^\infty(\mathcal{M})$-vector space, whose addition and multiplication by a scalar field $\phi \in C^\infty(\mathcal{M})$
are defined on each $U_i$ through the functions $X_i^\mu$, which then determine the resulting vector
field 3. We have that if $X_i^\mu$, $Y_i^\mu$ and $\phi_i$ are the local functions associated to $X$, $Y$ and $\phi$,
respectively, then the local functions of $X + Y$ and $\phi X$ are given by $X_i^\mu + Y_i^\mu$ and $\phi_i X_i^\mu$ 4.

At this point we can make contact with the alternative definition of a vector field which is
as a derivation on $C^\infty(\mathcal{M})$, i.e. an $\mathbb{R}$-linear operator $D_X : C^\infty(\mathcal{M}) \to C^\infty(\mathcal{M})$ obeying the
Leibniz rule

$$ D_X(\alpha\phi + \beta\phi') = \alpha D_X\phi + \beta D_X\phi' , \qquad D_X(\phi\phi') = (D_X\phi)\,\phi' + \phi\, D_X\phi' , $$ (A.1.13)

where $\alpha, \beta$ are real constants. Indeed, these properties fully determine $D_X$: if $\phi_i$ denotes
the local functions of $\phi$ then the local functions of $D_X\phi$ are $X_i^\mu \partial_\mu \phi_i$, for some functions $X_i^\mu$
which we can identify with the fibre components of a section (A.1.10). Indeed, the fact that
$D_X\phi \in C^\infty(\mathcal{M})$ implies

$$ X_i^\mu(x_i)\, \frac{\partial}{\partial x_i^\mu} = X_j^\mu(x_j)\, \frac{\partial}{\partial x_j^\mu} , $$ (A.1.14)

which is precisely (A.1.12). It is a common abuse of terminology to call this derivation the
“vector field”, in which case the $\partial_\mu$ form a basis of vector fields.

Finally, anticipating the generalization to tensors, we must look for yet another operator
interpretation of vector fields. To that end we can define the cotangent bundle $T_1\mathcal{M}$ following
the same steps as we did for $T^1\mathcal{M}$, only this time with coordinates $(x_i^\mu, k_{i\mu})$ and with transition
functions obeying

$$ g_{ij}(x_j, k_j) = \left( f_{ij}(x_j) ,\ \frac{\partial f_{ji}^\nu}{\partial x_i^\mu}(f_{ij}(x_j))\, k_{j\nu} \right) . $$ (A.1.15)

A covector $\alpha$ is then a section of $T_1\mathcal{M}$ and has a natural action as a linear functional $\alpha :
\Gamma(T^1\mathcal{M}) \to C^\infty(\mathcal{M})$. Indeed, its associated local functions $\alpha_i \equiv \alpha \circ f_i^{-1}$

$$ \alpha_i : f_i(U_i) \to g_i(V_i) , $$

2 The advantage of the section representation is that it is global and thus unique, while the $X_i^\mu(x)$ information
is local and contains as many functions as the number of $U_i$ that are needed to cover $\mathcal{M}$.

3 Indeed, we cannot define these operations using directly the maps $X$ and $Y$ because their target space is
not a space of numbers.

4 Note that scalar fields can also be expressed in this fibre bundle language as sections of $T^0\mathcal{M}$. The base
is still $\mathcal{M}$, the fibre is just $\mathbb{R}$, the transition maps are trivial $g_{ij}(x_j, k_j) = (f_{ij}(x_j), k_j)$ and the scalar fields are
sections which in local coordinates are given by functions $x_i^\mu \mapsto (x_i^\mu, \phi_i(x_i))$. The addition and multiplication
operations on $\Gamma(T^0\mathcal{M})$ must then be defined through the local functions.


$$ x_i \mapsto \left( x_i^\mu , \alpha_{i\nu}(x_i) \right) , $$ (A.1.16)

transform as

$$ \alpha_{i\mu}(x_i) = \frac{\partial x_j^\nu}{\partial x_i^\mu}(x_i(x_j))\, \alpha_{j\nu}(x_j) , $$ (A.1.17)

under the coordinate transformation $x_i = f_{ij}(x_j)$, and thus

$$ \phi_i(x_i) \equiv \alpha_{i\mu}(x_i)\, X_i^\mu(x_i) = \alpha_{j\mu}(x_j)\, X_j^\mu(x_j) = \phi_j(x_j) $$ (A.1.18)

transforms as the local function on $U_i$ of a scalar. This defines the interior product $X \cdot \alpha \in
C^\infty(\mathcal{M})$. Just as $\partial_\mu$ provides a basis for vector fields, because of its transformation properties,
so does the differential $dx^\mu$ provide a basis for covectors

$$ dx_i^\mu = \frac{\partial x_i^\mu}{\partial x_j^\nu}\, dx_j^\nu , $$ (A.1.19)

and we have the analogue of (A.1.14)

$$ \alpha_{i\mu}(x_i)\, dx_i^\mu = \alpha_{j\nu}(x_j)\, dx_j^\nu . $$ (A.1.20)

Alternatively, the vectors can also be interpreted as linear functionals on $\Gamma(T_1\mathcal{M})$. It is this
dual linear operator interpretation that generalizes straightforwardly to the case of higher-rank
tensors.
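
The invariance of the contraction (A.1.18) under a change of chart can be checked numerically. The following is only a minimal sketch under illustrative assumptions that are not part of the text: the manifold is the plane, chart $i$ is Cartesian, chart $j$ is polar, and the helper names and component values are hypothetical.

```python
import math

def jacobian_polar_to_cartesian(r, theta):
    """Matrix dx_i^mu / dx_j^nu for x_i = (x, y) = f_ij(r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix, i.e. dx_j^nu / dx_i^mu."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def push_vector(jac, x_j):
    """Vector rule (A.1.12): X_i^mu = (dx_i^mu / dx_j^nu) X_j^nu."""
    return [sum(jac[mu][nu] * x_j[nu] for nu in range(2)) for mu in range(2)]

def push_covector(jac, alpha_j):
    """Covector rule (A.1.17): alpha_{i mu} = (dx_j^nu / dx_i^mu) alpha_{j nu}."""
    inv = inverse_2x2(jac)  # inv[nu][mu] = dx_j^nu / dx_i^mu
    return [sum(inv[nu][mu] * alpha_j[nu] for nu in range(2)) for mu in range(2)]

# Hypothetical components in the polar chart at the point (r, theta).
r, theta = 1.3, 0.7
X_polar = [0.4, -1.1]
alpha_polar = [2.0, 0.5]

J = jacobian_polar_to_cartesian(r, theta)
X_cart = push_vector(J, X_polar)
alpha_cart = push_covector(J, alpha_polar)

# The two contractions are the local functions phi_j(x_j) and phi_i(x_i) of (A.1.18).
contraction_polar = sum(a * x for a, x in zip(alpha_polar, X_polar))
contraction_cart = sum(a * x for a, x in zip(alpha_cart, X_cart))
```

Both contractions agree up to rounding, because the Jacobian entering the vector rule is the inverse of the one entering the covector rule.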


Tensor bundle 

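Having defined $T^1\mathcal{M}$ and $T_1\mathcal{M}$ we can construct the tensor product bundle

$$ T^n_m\mathcal{M} \equiv \underbrace{T_1\mathcal{M} \otimes \dots \otimes T_1\mathcal{M}}_{m\ {\rm times}} \otimes \underbrace{T^1\mathcal{M} \otimes \dots \otimes T^1\mathcal{M}}_{n\ {\rm times}} . $$ (A.1.21)

The $\otimes$ operation means that one takes the tensor product of the fibres at each point $p \in \mathcal{M}$, but
keeps the same base manifold $\mathcal{M}$. The fibre coordinates will therefore take values in the vector
space generated by $k_{\mu_1} \dots k_{\mu_m} k'^{\nu_1} \dots k'^{\nu_n}$, thus corresponding to $k^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m}$ coordinates. So $T^n_m\mathcal{M}$
is a $(D + D^{n+m})$-dimensional differentiable manifold with a projection map $\pi : T^n_m\mathcal{M} \to \mathcal{M}$
and fibre $\pi^{-1}(p) \simeq \mathbb{R}^{D^{n+m}}$. The set of charts $(g_i, V_i)$

$$ g_i : V_i \to \mathbb{R}^{D + D^{n+m}} , \qquad q \mapsto \left( x_i^\mu ,\, k_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \right) , $$ (A.1.22)

is such that

$$ \pi(V_i) = U_i , \qquad (f_i \circ \pi \circ g_i^{-1})(x_i, k_i) = x_i , \qquad \bigcup_i V_i = T^n_m\mathcal{M} , $$ (A.1.23)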

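and the transition functions $g_{ij} \equiv g_i \circ g_j^{-1}$ are of the form

$$ g_{ij}(x_j, k_j) = \left( f_{ij}(x_j) ,\ \frac{\partial f_{ij}^{\nu_1}}{\partial x_j^{\beta_1}}(x_j) \dots \frac{\partial f_{ij}^{\nu_n}}{\partial x_j^{\beta_n}}(x_j)\, \frac{\partial f_{ji}^{\alpha_1}}{\partial x_i^{\mu_1}}(f_{ij}(x_j)) \dots \frac{\partial f_{ji}^{\alpha_m}}{\partial x_i^{\mu_m}}(f_{ij}(x_j))\, k_{j\,\alpha_1 \dots \alpha_m}^{\;\beta_1 \dots \beta_n} \right) . $$ (A.1.24)

A tensor of rank $(n,m)$ is then a section of $T^n_m\mathcal{M}$, i.e. a map $T : \mathcal{M} \to T^n_m\mathcal{M}$ that is a
right-inverse of the projection $\pi \circ T = {\rm id}_{\mathcal{M}}$. Thus, defining the local functions $T_i \equiv T \circ f_i^{-1}$ we
have

$$ T_i : f_i(U_i) \to g_i(V_i) , \qquad x_i \mapsto \left( x_i^\mu ,\, T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i) \right) , $$ (A.1.25)

and the fibre components $T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i)$, given (A.1.24), transform as

$$ T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i) = \frac{\partial x_i^{\nu_1}}{\partial x_j^{\beta_1}}(x_j) \dots \frac{\partial x_i^{\nu_n}}{\partial x_j^{\beta_n}}(x_j)\, \frac{\partial x_j^{\alpha_1}}{\partial x_i^{\mu_1}}(x_i(x_j)) \dots \frac{\partial x_j^{\alpha_m}}{\partial x_i^{\mu_m}}(x_i(x_j))\, T_{j\,\alpha_1 \dots \alpha_m}^{\;\beta_1 \dots \beta_n}(x_j) , $$ (A.1.26)

under the coordinate transformation $x_i = f_{ij}(x_j)$. As in the case of (co-)vectors, by a slight
abuse of language, one usually calls $T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i)$ the components of the tensor field. The
addition and multiplication by a scalar operations are defined through the local functions just
as in the case of vectors. We can now include the tensor product among the operations of
interest, which is also defined through the local functions. If $T \in \Gamma(T^n_m\mathcal{M})$ and $S \in \Gamma(T^s_r\mathcal{M})$,
then $T \otimes S \in \Gamma(T^{n+s}_{m+r}\mathcal{M})$ is given by $(T \otimes S)_i \equiv (T \otimes S) \circ f_i^{-1}$

$$ (T \otimes S)_i : x_i \mapsto \left( x_i^\mu ,\, T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i)\, S_{i\,\mu_{m+1} \dots \mu_{m+r}}^{\;\nu_{n+1} \dots \nu_{n+s}}(x_i) \right) . $$ (A.1.27)

Finally, using the $X = X^\mu \partial_\mu$ and $\alpha = \alpha_\mu dx^\mu$ interpretation of (co-)vectors, the “basis of
$\Gamma(T^n_m\mathcal{M})$” in this case is the tensor product

$$ dx^{\mu_1} \otimes \dots \otimes dx^{\mu_m} \otimes \frac{\partial}{\partial x^{\nu_1}} \otimes \dots \otimes \frac{\partial}{\partial x^{\nu_n}} , $$ (A.1.28)

where here $\otimes$ means “multiplication and evaluation at the same point of $\mathcal{M}$”, so that

$$ T_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i)\, dx_i^{\mu_1} \otimes \dots \otimes dx_i^{\mu_m} \otimes \frac{\partial}{\partial x_i^{\nu_1}} \otimes \dots \otimes \frac{\partial}{\partial x_i^{\nu_n}} = T_{j\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_j)\, dx_j^{\mu_1} \otimes \dots \otimes dx_j^{\mu_m} \otimes \frac{\partial}{\partial x_j^{\nu_1}} \otimes \dots \otimes \frac{\partial}{\partial x_j^{\nu_n}} . $$ (A.1.29)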

A.1.2 Bi-tensors 
Bi-manifold & bi-scalars 

We now wish to construct tensor-like fields that depend on two points of $\mathcal{M}$. We therefore
begin by defining the Cartesian product $\mathcal{M}^2 \equiv \mathcal{M}_L \times \mathcal{M}_R$, where these are two copies of $\mathcal{M}$
that we will call the “left” and “right” ones. In the above product it is understood that $\mathcal{M}^2$
has the product topology and atlas $\mathcal{A}_{\mathcal{M}_L} \times \mathcal{A}_{\mathcal{M}_R}$, i.e. the one made of the pairs

$$ (U_{i|j}, f_{i|j}) \equiv (U_i \times U_j ,\ f_i \times f_j) . $$ (A.1.30)

Thus, a chart on $\mathcal{M}^2$ is a pair of open sets, one on $\mathcal{M}_L$ and one on $\mathcal{M}_R$, followed by a pair of
functions that coordinatize each open set independently. The product topology gives

$$ U_{i|j} \cap U_{k|l} = (U_i \times U_j) \cap (U_k \times U_l) = (U_i \cap U_k) \times (U_j \cap U_l) , $$ (A.1.31)

and the same for the union operation, and the transition functions decompose

$$ f_{ik|jl} \equiv f_{i|j} \circ f_{k|l}^{-1} = (f_i \circ f_k^{-1}) \times (f_j \circ f_l^{-1}) , $$ (A.1.32)

so that these two manifolds do not “see” each other, i.e. one can perform coordinate
transformations on each one of them independently. A bi-scalar field is a map $\phi : \mathcal{M}^2 \to \mathbb{R}$ such that
the functions

$$ \phi_{i|j} \equiv \phi \circ f_{i|j}^{-1} : f_i(U_i) \times f_j(U_j) \to \mathbb{R} , \qquad (x_i, y_j) \mapsto \phi_{i|j}(x_i, y_j) , $$ (A.1.33)

are smooth in both arguments. Following the same steps as for the ordinary scalar field, its
transformation under independent coordinate transformations $x_i = f_{ik}(x_k)$ and $y_j = f_{jl}(y_l)$ is
thus

$$ \phi_{i|j}(x_i, y_j) = \phi_{k|l}(x_k, y_l) . $$ (A.1.34)


Bi-tensor bundle 


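We can now define the bi-tensor bundle $B^{n|s}_{m|r}\mathcal{M}$ as follows. It is a differentiable fibre bundle
of dimension $2D + D^{n+m+r+s}$, based on $\mathcal{M}^2$, with projection map $\pi_B : B^{n|s}_{m|r}\mathcal{M} \to \mathcal{M}^2$ and
fibre $\pi_B^{-1}(p_L, p_R) \simeq \mathbb{R}^{D^{n+m+r+s}}$. Its atlas $\mathcal{A}_{B^{n|s}_{m|r}\mathcal{M}}$ is constructed as follows. For every pair
$(U_{i|j}, f_{i|j}) \in \mathcal{A}_{\mathcal{M}^2}$, we pick an open set $V_{i|j} \subset B^{n|s}_{m|r}\mathcal{M}$ and a homeomorphism

$$ g_{i|j} : V_{i|j} \to \mathbb{R}^{2D + D^{n+m+r+s}} , \qquad q \mapsto \left( x_i^\mu ,\, y_j^\mu ,\, k_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_r}^{\;\sigma_1 \dots \sigma_s} \right) , $$ (A.1.35)

such that

$$ \pi_B(V_{i|j}) = U_{i|j} , \qquad (f_{i|j} \circ \pi_B \circ g_{i|j}^{-1})(x_i, y_j, k_{i|j}) = (x_i, y_j) , \qquad \bigcup_{i,j} V_{i|j} = B^{n|s}_{m|r}\mathcal{M} , $$ (A.1.36)

and the transition functions $g_{ik|jl} \equiv g_{i|j} \circ g_{k|l}^{-1}$ are of the form

$$ g_{ik|jl}(x_k, y_l, k_{k|l}) = \Big( f_{ik}(x_k) ,\ f_{jl}(y_l) ,\ \frac{\partial f_{ik}^{\nu_1}}{\partial x_k^{\beta_1}}(x_k) \dots \frac{\partial f_{ik}^{\nu_n}}{\partial x_k^{\beta_n}}(x_k)\, \frac{\partial f_{ki}^{\alpha_1}}{\partial x_i^{\mu_1}}(f_{ik}(x_k)) \dots \frac{\partial f_{ki}^{\alpha_m}}{\partial x_i^{\mu_m}}(f_{ik}(x_k)) $$
$$ \qquad \times\, \frac{\partial f_{jl}^{\sigma_1}}{\partial y_l^{\delta_1}}(y_l) \dots \frac{\partial f_{jl}^{\sigma_s}}{\partial y_l^{\delta_s}}(y_l)\, \frac{\partial f_{lj}^{\gamma_1}}{\partial y_j^{\rho_1}}(f_{jl}(y_l)) \dots \frac{\partial f_{lj}^{\gamma_r}}{\partial y_j^{\rho_r}}(f_{jl}(y_l))\; k_{k\,\alpha_1 \dots \alpha_m}^{\;\beta_1 \dots \beta_n} \big|_{l\,\gamma_1 \dots \gamma_r}^{\;\delta_1 \dots \delta_s} \Big) . $$ (A.1.37)

Note that we have used a vertical bar to distinguish between the two types of indices, i.e. the
“left” ones mixing with Jacobians evaluated at the left point $x$, and the “right” ones mixing
with Jacobians evaluated at the right point $y$. A bi-tensor $G$ would then be a section of
$B^{n|s}_{m|r}\mathcal{M}$, i.e. a continuous map $G : \mathcal{M}^2 \to B^{n|s}_{m|r}\mathcal{M}$ that is a right-inverse for the projection
map $\pi_B \circ G = {\rm id}_{\mathcal{M}^2}$. Thus, defining the functions $G_{i|j} \equiv G \circ f_{i|j}^{-1}$, in local coordinates

$$ G_{i|j} : f_{i|j}(U_{i|j}) \to g_{i|j}(V_{i|j}) , \qquad (x_i, y_j) \mapsto \left( x_i^\mu ,\, y_j^\mu ,\, G_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_r}^{\;\sigma_1 \dots \sigma_s}(x_i, y_j) \right) , $$ (A.1.38)

the local components $G_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_r}^{\;\sigma_1 \dots \sigma_s}(x_i, y_j)$, given (A.1.37), transform as

$$ G_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_r}^{\;\sigma_1 \dots \sigma_s}(x_i, y_j) = \frac{\partial x_i^{\nu_1}}{\partial x_k^{\beta_1}}(x_k) \dots \frac{\partial x_i^{\nu_n}}{\partial x_k^{\beta_n}}(x_k)\, \frac{\partial x_k^{\alpha_1}}{\partial x_i^{\mu_1}}(x_i(x_k)) \dots \frac{\partial x_k^{\alpha_m}}{\partial x_i^{\mu_m}}(x_i(x_k)) $$
$$ \qquad \times\, \frac{\partial y_j^{\sigma_1}}{\partial y_l^{\delta_1}}(y_l) \dots \frac{\partial y_j^{\sigma_s}}{\partial y_l^{\delta_s}}(y_l)\, \frac{\partial y_l^{\gamma_1}}{\partial y_j^{\rho_1}}(y_j(y_l)) \dots \frac{\partial y_l^{\gamma_r}}{\partial y_j^{\rho_r}}(y_j(y_l))\; G_{k\,\alpha_1 \dots \alpha_m}^{\;\beta_1 \dots \beta_n} \big|_{l\,\gamma_1 \dots \gamma_r}^{\;\delta_1 \dots \delta_s}(x_k, y_l) , $$ (A.1.39)

under the independent coordinate transformations $x_i = f_{ik}(x_k)$ and $y_j = f_{jl}(y_l)$. In order to
express such an object in the notation (A.1.28) we need to define a new kind of product. We
thus use the notation

$$ G^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_s}_{\rho_1 \dots \rho_r}(x, y) \left( dx^{\mu_1} \otimes \dots \otimes dx^{\mu_m} \otimes \frac{\partial}{\partial x^{\nu_1}} \otimes \dots \otimes \frac{\partial}{\partial x^{\nu_n}} \right) \otimes_B \left( dy^{\rho_1} \otimes \dots \otimes dy^{\rho_r} \otimes \frac{\partial}{\partial y^{\sigma_1}} \otimes \dots \otimes \frac{\partial}{\partial y^{\sigma_s}} \right) , $$ (A.1.40)

and dub $\otimes_B$ the “bi-tensor” product, which means that the tensors on each side are evaluated
on independent points of $\mathcal{M}$. This notation is again consistent with the transformation rule
(A.1.39) given the way the basis transforms. It is then straightforward to generalize the concept
to bi-tensor densities and also to “tri-tensors”, “quadri-tensors” etc, by taking more and more
bi-tensor products.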
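
The independence of the left and right charts can be illustrated with the simplest case, a bi-scalar obeying (A.1.34). The sketch below is only an illustration under assumptions that are not in the text: the manifold is the Euclidean plane, the bi-scalar is the squared distance between the two points, and each copy is described either by a Cartesian or by a polar chart. All function names are hypothetical.

```python
import math

def polar_to_cartesian(p):
    """Transition function from the polar chart to the Cartesian chart."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def phi_cc(x, y):
    """Local function with Cartesian charts on both the left and the right copy."""
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2

def phi_pc(x_polar, y):
    """Local function with a polar chart on the left and a Cartesian chart on the right."""
    return phi_cc(polar_to_cartesian(x_polar), y)

def phi_pp(x_polar, y_polar):
    """Local function with polar charts on both copies (law of cosines)."""
    r1, t1 = x_polar
    r2, t2 = y_polar
    return r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(t1 - t2)

# The same pair of points, described in different pairs of charts.
x_polar, y_polar = (2.0, 0.5), (1.5, 2.0)
x_cart, y_cart = polar_to_cartesian(x_polar), polar_to_cartesian(y_polar)
values = [phi_cc(x_cart, y_cart), phi_pc(x_polar, y_cart), phi_pp(x_polar, y_polar)]
```

The three entries of `values` coincide up to rounding: changing the chart on one copy never requires touching the coordinates used on the other one.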


A.1.3 Bi-tensor calculus 
Differentiation 

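Since a bi-tensor basically “lives” on two points of the manifold, and their corresponding
tangent tensor spaces, it can be covariantly differentiated at each point separately. Indeed,
the transformation (A.1.39) implies that one can apply covariant derivatives at each point
separately, and with respect to the corresponding indices only, because $x$ and $y$ are independent.
One must simply let the notation reflect the choice of point, so we will use $\nabla_L$ and $\nabla_R$ for the
operators on bi-tensors, while we will use $\nabla_{\mu|}$ and $\nabla_{|\mu}$ for their representation on the bi-tensor
components. For example, given $G \in \Gamma(B^{1|1}_{0|1}\mathcal{M})$,

$$ \nabla_{\mu|}\, G^{\nu}\big|^{\rho}_{\sigma}(x, y) = \frac{\partial}{\partial x^\mu}\, G^{\nu}\big|^{\rho}_{\sigma}(x, y) + \Gamma^{\nu}_{\alpha\mu}(x)\, G^{\alpha}\big|^{\rho}_{\sigma}(x, y) , $$ (A.1.41)

are the local components of $\nabla_L G \in \Gamma(B^{1|1}_{1|1}\mathcal{M})$, while

$$ \nabla_{|\mu}\, G^{\nu}\big|^{\rho}_{\sigma}(x, y) = \frac{\partial}{\partial y^\mu}\, G^{\nu}\big|^{\rho}_{\sigma}(x, y) + \Gamma^{\rho}_{\alpha\mu}(y)\, G^{\nu}\big|^{\alpha}_{\sigma}(x, y) - G^{\nu}\big|^{\rho}_{\alpha}(x, y)\, \Gamma^{\alpha}_{\sigma\mu}(y) , $$ (A.1.42)

are the local components of $\nabla_R G \in \Gamma(B^{1|1}_{0|2}\mathcal{M})$. Pay attention to the various dependencies and
index contractions. With this additional information the commutator of covariant derivatives
generalizes accordingly. We have for instance

$$ [\nabla_{\mu|}, \nabla_{|\nu}]\, G^{\rho}\big|^{\sigma}_{\tau}(x, y) = 0 , $$ (A.1.43)

$$ [\nabla_{\mu|}, \nabla_{\nu|}]\, G^{\rho}\big|^{\sigma}_{\tau}(x, y) = R^{\rho}{}_{\alpha\mu\nu}(x)\, G^{\alpha}\big|^{\sigma}_{\tau}(x, y) , $$ (A.1.44)

$$ [\nabla_{|\mu}, \nabla_{|\nu}]\, G^{\rho}\big|^{\sigma}_{\tau}(x, y) = R^{\sigma}{}_{\alpha\mu\nu}(y)\, G^{\rho}\big|^{\alpha}_{\tau}(x, y) - G^{\rho}\big|^{\sigma}_{\alpha}(x, y)\, R^{\alpha}{}_{\tau\mu\nu}(y) . $$ (A.1.45)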


Integration on $\mathcal{M}$

Remember that integration is defined on manifolds by splitting the integral through a partition
of unity subordinate to the open cover $U_i$ and evaluating the integral on each $U_i$ using the local
functions. More precisely, let us denote by $I$ the set of indices indexing the open sets $U_i$ of
$\mathcal{A}_{\mathcal{M}}$. We can then pick a locally finite covering $I' \subset I$, i.e. a subset $\{U_i\}_{i \in I'}$ that still covers
$\mathcal{M}$ but such that for every $p \in \mathcal{M}$ there exists only a finite number of $U_i$ for which $p \in U_i$.
Smooth manifolds which admit such locally finite refinements are called “paracompact”. Then,
a partition of unity subordinate to $\{U_i\}_{i \in I'}$ is the attribution of a scalar $\rho_i$ to each $U_i$ with
$i \in I'$ such that

• $\mathrm{supp}(\rho_i) \subset U_i$ ,

• $\sum_{i \in I'} \rho_i = 1$ .

The integral of a scalar field $\phi$ over $\mathcal{M}$ is then defined as follows. One first needs to define a
measure, i.e. a $D$-form $\omega$ such that the local density functions

$$ \omega|_{U_i} = \omega_i(x_i)\, dx_i^0 \wedge \dots \wedge dx_i^d , $$ (A.1.46)

have positive definite sign $\omega_i(x_i) > 0$. Given a metric tensor $g$, the physically sensible choice is
$\omega_i(x_i) = \sqrt{-g_i(x_i)}$ where $g_i(x_i)$ is the determinant of $g_{i\mu\nu}(x_i)$. We then have that the integral
of $\phi$ is given by

$$ \int_{\mathcal{M}} \omega\, \phi \equiv \sum_{i \in I'} \int_{f_i(U_i)} d^D x_i\, \rho_i(x_i)\, \omega_i(x_i)\, \phi_i(x_i) , $$ (A.1.47)

where $\rho_i$, $\omega_i$ and $\phi_i$ are the local functions of $\rho_i$, $\omega$ and $\phi$ on $U_i$, respectively. The sum in the
right-hand side is well defined because at each point only a finite number of the $\rho_i$ are
non-zero.
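
As an illustration of (A.1.47), one can integrate a function over the unit circle with two charts and a smooth partition of unity. This is only a sketch under assumptions that are not in the text: the circle is covered by two angular charts centred at $\theta = 0$ and $\theta = \pi$, the measure is $\omega_i = 1$, the partition of unity is built from standard bump functions, and all names are hypothetical.

```python
import math

HALF_WIDTH = 2.5  # half-width of each bump's support; must lie between pi/2 and pi

def wrap(angle):
    """Map an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def bump(s):
    """Smooth function supported on |s| <= 1."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def rho(i, theta):
    """Partition of unity: rho_1 is centred at theta = 0, rho_2 at theta = pi."""
    b1 = bump(wrap(theta) / HALF_WIDTH)
    b2 = bump(wrap(theta - math.pi) / HALF_WIDTH)
    return (b1 if i == 1 else b2) / (b1 + b2)

def phi(theta):
    """Scalar field to integrate; its exact integral over the circle is 3*pi."""
    return 1.0 + math.cos(theta) ** 2

def chart_integral(i, n=4000):
    """Trapezoid rule for the integral of rho_i * omega_i * phi_i in chart i (omega_i = 1)."""
    centre = 0.0 if i == 1 else math.pi
    h = 2.0 * HALF_WIDTH / n
    total = 0.0
    for k in range(n + 1):
        x = -HALF_WIDTH + k * h          # local coordinate x_i of chart i
        theta = centre + x               # the corresponding point of the circle
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * rho(i, theta) * phi(theta)
    return total * h

integral = chart_integral(1) + chart_integral(2)
```

Each chart integral only sees the part of the circle where its own $\rho_i$ is supported, and the sum reproduces $\int_0^{2\pi} (1 + \cos^2\theta)\, d\theta = 3\pi$ up to the quadrature error.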

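The generalization to bi-tensors is straightforward. It relies on the fact that if $\rho_i^L$ and $\rho_j^R$
form partitions of unity of $\mathcal{M}_L$ and $\mathcal{M}_R$ subordinate to their respective atlases, then $\rho_i^L \rho_j^R$
forms a partition of unity of $\mathcal{M}^2$ subordinate to $\mathcal{A}_{\mathcal{M}^2}$. As for differentiation, one can
then define the integration on $\mathcal{M}_L$ and $\mathcal{M}_R$ independently. For example, given $G \in \Gamma(B^{n|0}_{m|0}\mathcal{M})$,
which is a scalar on $\mathcal{M}_R$, one can define

$$ \int_{\mathcal{M}_R} \omega\, G , $$ (A.1.48)

by specifying the local functions on $U_i$

$$ \left( \int_{\mathcal{M}_R} \omega\, G \right)_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n}(x_i) \equiv \sum_{j \in I'} \int_{f_j(U_j)} d^D y_j\, \rho_j^R(y_j)\, \omega_j(y_j)\, G_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j}(x_i, y_j) . $$ (A.1.49)

Given the independence of the two space-time points, the above object is clearly an element of
$\Gamma(T^n_m\mathcal{M})$. As is usual in the literature, we will use a slightly less rigorous notation to describe
such integrals, i.e. one that does not care about how the integral is partitioned or about the
fact that usually several coordinate charts are needed. In this case for instance we can write

$$ \left( \int_{\mathcal{M}_R} \omega\, G \right)^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m}(x) = \int_{\mathcal{M}} d^D y\, \omega(y)\, G^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|(x, y) , $$ (A.1.50)

so that one can see with respect to which manifold we are integrating. Finally, we define the
following notations. For $G \in \Gamma(B^{n|s}_{m|r}\mathcal{M})$, $T \in \Gamma(T^r_s\mathcal{M})$ and $T' \in \Gamma(T^m_n\mathcal{M})$

$$ (G \cdot_\omega T)^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m}(x) \equiv \int_{\mathcal{M}} d^D y\, \omega(y)\, G^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_s}_{\rho_1 \dots \rho_r}(x, y)\, T^{\rho_1 \dots \rho_r}_{\sigma_1 \dots \sigma_s}(y) , $$ (A.1.51)

$$ (T' \cdot_\omega G)^{\sigma_1 \dots \sigma_s}_{\rho_1 \dots \rho_r}(y) \equiv \int_{\mathcal{M}} d^D x\, \omega(x)\, T'^{\mu_1 \dots \mu_m}_{\nu_1 \dots \nu_n}(x)\, G^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_s}_{\rho_1 \dots \rho_r}(x, y) , $$ (A.1.52)

are also elements of $\Gamma(T^n_m\mathcal{M})$ and $\Gamma(T^s_r\mathcal{M})$, respectively. We thus have that, for any measure
$\omega$, the elements of $\Gamma(B^{n|m}_{m|n}\mathcal{M})$ can be thought of as left-$\mathbb{R}$-linear endomorphisms of $\Gamma(T^n_m\mathcal{M})$
and right-$\mathbb{R}$-linear endomorphisms of $\Gamma(T^m_n\mathcal{M})$. Finally, since in the physically relevant cases
$\omega(x) = \sqrt{-g(x)}$, a dot without argument means $\cdot \equiv \cdot_{\sqrt{-g}}$.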


A.2 Bi-tensor distributions 

The notion of bi-tensor, combined with the notion of distribution, ultimately allows one to define
functional analysis on manifolds. The most interesting cases for us are the Dirac
delta bi-tensor and the Green’s bi-tensor. Disclaimer: here we will only focus on the aspects
of the generalization of these notions to curved space-time. We will not concern ourselves with
the functional analysis side of the field, i.e. we will not care about domains, continuity and
convergence issues that are nevertheless crucial aspects of the theory of distributions.


A.2.1 The Dirac delta bi-tensor 


The Dirac delta bi-tensor is defined, as the ordinary Dirac delta, by its distributional properties.
The $(n,m)$-Dirac delta bi-tensor associated to the measure $\omega$ is the bi-tensor $\Delta \in \Gamma(B^{n|m}_{m|n}\mathcal{M})$
satisfying

$$ \Delta \cdot_\omega T = T , \qquad T' \cdot_\omega \Delta = T' , $$ (A.2.1)

for all $T \in \Gamma(T^n_m\mathcal{M})$ and $T' \in \Gamma(T^m_n\mathcal{M})$. This uniquely determines its associated local
functions, which are of course going to be related to the Dirac delta function. The latter transforms
as a scalar density of weight $-1$ under a diffeomorphism $x' = f(x)$. Indeed,

$$ 1 = \int d^D x'\, \delta^{(D)}(x') = \int d^D x\, \det\!\left( \frac{\partial f}{\partial x}(x) \right) \delta^{(D)}(f(x)) = \int d^D x\, \delta^{(D)}(x) , $$ (A.2.2)

so

$$ \delta^{(D)}(f(x)) = \left[ \det\!\left( \frac{\partial f}{\partial x}(x) \right) \right]^{-1} \delta^{(D)}(x) . $$ (A.2.3)

Thus the combination $\delta^{(D)}(x)/\omega(x)$ is a scalar. Repeating with the shifted diffeomorphism
$f(x) \to f(x) - f(y)$, we get

$$ \delta^{(D)}(f(x) - f(y)) = \left[ \det\!\left( \frac{\partial f}{\partial x}(x) \right) \right]^{-1} \delta^{(D)}(x - y) , $$ (A.2.4)

so that

$$ \frac{\delta^{(D)}(x - y)}{\omega(x)} , \qquad \frac{\delta^{(D)}(x - y)}{\omega(y)} , \qquad \frac{\delta^{(D)}(x - y)}{\sqrt{\omega(x)\,\omega(y)}} , $$ (A.2.5)

are all the same scalar when $x$ and $y$ are coordinates of the same chart, and thus transform
together under the same transition functions. We can now make the link with the Dirac delta
bi-tensor. To fully determine the latter it suffices to determine its local functions $\Delta_{i|j} \equiv \Delta \circ f_{i|j}^{-1}$
on the open sets $U_{i|j}$. We then have that


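$$ \Delta_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_n}^{\;\sigma_1 \dots \sigma_m}(x_i, y_j) = 0 , \qquad \text{if } U_i \cap U_j = \emptyset \text{ in } \mathcal{M} , $$ (A.2.6)

while, if $U_i \cap U_j$ is non-empty in $\mathcal{M}$,

$$ \Delta_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{j\,\rho_1 \dots \rho_n}^{\;\sigma_1 \dots \sigma_m}(x_i, y_j) = \frac{\partial f_{ij}^{\nu_1}}{\partial y_j^{\rho_1}}(y_j) \dots \frac{\partial f_{ij}^{\nu_n}}{\partial y_j^{\rho_n}}(y_j)\, \frac{\partial f_{ji}^{\sigma_1}}{\partial x_i^{\mu_1}}(f_{ij}(y_j)) \dots \frac{\partial f_{ji}^{\sigma_m}}{\partial x_i^{\mu_m}}(f_{ij}(y_j))\, \frac{\delta^{(D)}(x_i - f_{ij}(y_j))}{\omega_i(x_i)} . $$ (A.2.7)

The latter is obtained by considering the case $i = j$

$$ \Delta_{i\,\mu_1 \dots \mu_m}^{\;\nu_1 \dots \nu_n} \big|_{i\,\rho_1 \dots \rho_n}^{\;\sigma_1 \dots \sigma_m}(x_i, y_i) = \delta^{\nu_1}_{\rho_1} \dots \delta^{\nu_n}_{\rho_n}\, \delta^{\sigma_1}_{\mu_1} \dots \delta^{\sigma_m}_{\mu_m}\, \frac{\delta^{(D)}(x_i - y_i)}{\omega_i(x_i)} , $$ (A.2.8)

and transforming the right coordinate $y_i$ to $y_j$ using the transition function $f_{ij}$. An important
property for what follows is the one involving the left and right-differentiations

$$ \nabla_L \Delta = -\nabla_R \Delta , \qquad \Delta \cdot_\omega \nabla T = -(\nabla T) \cdot_\omega \Delta , $$ (A.2.9)

which is proved by convolution with test tensors and integration by parts.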
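
The density character of the Dirac delta, (A.2.3), can be checked numerically in the simplest setting. The sketch below is only an illustration under assumptions that are not in the text: it works in $D = 1$, replaces the delta by a narrow Gaussian, and uses the hypothetical diffeomorphism $f(x) = 2x + x^3$, for which $f(0) = 0$ and $f'(0) = 2$.

```python
import math

def nascent_delta(u, eps):
    """Gaussian approximation of the Dirac delta, of width eps."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def f(x):
    """A diffeomorphism of the real line with f(0) = 0 and f'(0) = 2."""
    return 2.0 * x + x ** 3

def smeared(test_function, eps=1e-3, half_range=0.05, n=20000):
    """Trapezoid approximation of the integral of delta_eps(f(x)) * test_function(x)."""
    h = 2.0 * half_range / n
    total = 0.0
    for k in range(n + 1):
        x = -half_range + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * nascent_delta(f(x), eps) * test_function(x)
    return total * h

value = smeared(math.cos)
expected = math.cos(0.0) / 2.0  # test_function(0) divided by the Jacobian f'(0)
```

Up to corrections that vanish with the width `eps`, `value` equals `expected`, i.e. $\delta(f(x))$ acts as $\delta(x)$ divided by the Jacobian, which is the one-dimensional instance of (A.2.3).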


A.2.2 Bi-tensor Green’s functions 

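Let $L[\nabla]$ denote a covariant differential operator acting on $\Gamma(T^n_m\mathcal{M})$, i.e. the space of $(n,m)$-
tensors. In terms of local components we thus have 5

$$ (LT)^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \equiv L^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_m}_{\rho_1 \dots \rho_n}[\nabla]\; T^{\rho_1 \dots \rho_n}_{\sigma_1 \dots \sigma_m} . $$ (A.2.11)

Note that because of the derivatives the kernel Ker[L] is non-zero, i.e. there exists $T \neq 0$ such that
$LT = 0$. There are therefore, roughly speaking, as many inverses of $L$ as there are elements in
Ker[L].

5 The bi-tensor notation here might appear misleading since $L$ is made of differential operators acting on a
single space-time point, but since it is an endomorphism on $\Gamma(T^n_m\mathcal{M})$, we can express it as the convolution with
a bi-tensor indeed. We just need to rewrite

$$ (LT)^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m}(x) = \int d^D y\, \sqrt{-g(y)}\; \Delta^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\kappa_1 \dots \kappa_m}_{\lambda_1 \dots \lambda_n}(x, y)\; L^{\lambda_1 \dots \lambda_n}_{\kappa_1 \dots \kappa_m} \big|^{\sigma_1 \dots \sigma_m}_{\rho_1 \dots \rho_n}[\nabla](y)\; T^{\rho_1 \dots \rho_n}_{\sigma_1 \dots \sigma_m}(y) , $$ (A.2.10)

and then integrate by parts the covariant derivatives in $L$ so that they act on $\Delta$. The boundary terms drop
because of the Dirac delta in $\Delta$ and the result is the convolution of $T$ with a bi-tensor.

A Green’s function for $L$ is a bi-tensor $G \in \Gamma(B^{n|m}_{m|n}\mathcal{M})$ such that its local functions
satisfy

$$ L^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_m}_{\rho_1 \dots \rho_n}[\nabla_L](x)\; G^{\rho_1 \dots \rho_n}_{\sigma_1 \dots \sigma_m} \big|^{\kappa_1 \dots \kappa_m}_{\lambda_1 \dots \lambda_n}(x, y) = \Delta^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\kappa_1 \dots \kappa_m}_{\lambda_1 \dots \lambda_n}(x, y) , $$ (A.2.12)

where here it is the Dirac delta bi-tensor associated with $\sqrt{-g}$ that is being used. Given such
a bi-tensor $G$, we have that the operator

$$ L_G^{-1}\, T \equiv G \cdot T , $$ (A.2.13)

is a right-inverse of $L$, i.e.

$$ L\, L_G^{-1} = {\rm id}_{\Gamma(T^n_m\mathcal{M})} . $$ (A.2.14)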

In this thesis, we will only focus on right-inverses that are $\mathbb{R}$-linear operators

$$ L^{-1}(\alpha T + \alpha' T') = \alpha\, L^{-1} T + \alpha' L^{-1} T' , \qquad \alpha, \alpha' \in \mathbb{R} , \ \text{constant} $$ (A.2.15)

and which can therefore be expressed as the convolution with a Green’s bi-tensor 6. On flat
space-time we have that the bi-tensor structure of $G$ simplifies considerably. For the local
functions corresponding to the same charts on $\mathcal{M}_L$ and $\mathcal{M}_R$, i.e. when $x$ and $y$ are in the same
coordinate chart, the converse property (A.3.1) along with Poincaré covariance imply that all
Green’s bi-tensors can be expressed in terms of a Green’s function

$$ G^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_m}_{\rho_1 \dots \rho_n}(x, y) = \delta^{\nu_1}_{\rho_1} \dots \delta^{\nu_n}_{\rho_n}\, \delta^{\sigma_1}_{\mu_1} \dots \delta^{\sigma_m}_{\mu_m}\, G(x - y) $$ (A.2.17)

where

$$ (LG)(x) = \delta^{(D)}(x) . $$ (A.2.18)

We will use the “r” subscript when referring to retarded Green’s functions, i.e. those that obey

$$ G_r{}^{\nu_1 \dots \nu_n}_{\mu_1 \dots \mu_m} \big|^{\sigma_1 \dots \sigma_m}_{\rho_1 \dots \rho_n}(x, y) = 0 , \qquad \text{unless } y \text{ is in the past light-cone of } x . $$ (A.2.19)

On flat space-time this condition uniquely determines $G$ because it totally determines the initial
conditions

$$ \lim_{x^0 \to -\infty} \partial_0^n\, G(x) = 0 , $$ (A.2.20)


where $n$ goes from 0 to the degree of $L$. For instance, the retarded Green’s function of $L =
\Box - m^2$ reads

$$ G_r(x) = \lim_{\epsilon \to 0^+} \int \frac{d^D k}{(2\pi)^D}\, \frac{\exp\left[ i\, \eta_{\mu\nu} k^\mu x^\nu \right]}{(k^0 + i\epsilon)^2 - \vec{k}^2 - m^2} , $$ (A.2.21)

and in $D = 4$ takes the simple form

$$ G_r(x) = -\frac{\theta(x^0)}{2\pi} \left[ \delta(|x|^2) - \theta(|x|^2)\, \frac{m J_1(m|x|)}{2|x|} \right] = -\frac{1}{4\pi} \left[ \frac{\delta(x^0 - |\vec{x}|)}{|\vec{x}|} - \theta(x^0)\, \theta(|x|^2)\, \frac{m J_1(m|x|)}{|x|} \right] , $$ (A.2.22)

where

$$ |x| \equiv \sqrt{-x_\mu x^\mu} , \qquad |\vec{x}| \equiv \sqrt{\delta_{ij} x^i x^j} , $$ (A.2.23)

and $J_1$ is a Bessel function of the first kind. We see that $G_r(x - y)$, seen as a function of $y$,
has a singular part which is supported only on the past light-cone of $x$ and a non-singular part
which is supported on the inside of the cone. The latter vanishes in the $m \to 0$ limit, consistent
with the fact that the information then propagates only at the speed of light and its trajectory
is thus stuck on the cone. Finally, note that the domain of definition of $L^{-1}$ are the tensors
that vanish sufficiently fast at past infinity for $m \neq 0$ and past null infinity for $m = 0$.

6 These must be contrasted with the more general case where the right-inverse is given by an affine operator

$$ L^{-1} T = h + G \cdot T , $$ (A.2.16)

with $h \in$ Ker[L] a homogeneous solution of $L$ that is independent of $T$.

The generalization to curved space-time presents the following subtleties. First of all, the
retarded Green’s bi-tensor of $\Box - m^2$ is still supported inside the past light-cone, it is just that
the latter is now non-trivial. Indeed, there might be more than one geodesic linking a given
pair of points, the most striking example being the gravitational lensing effect. Second, one
needs to impose global hyperbolicity on the pair $(\mathcal{M}, g)$ in order to have a causal space-time
with a past that extends to infinity, and in which case the past light-cone would also extend
to the infinite past. In that case, the domain of definition of $L^{-1}$ are the tensors that vanish
sufficiently fast at past infinity. More precisely, since $\Box - m^2$ is second-order, taking $t$ to denote
the global time coordinate (Geroch’s theorem), we need

$$ \lim_{t \to -\infty} T = 0 , \qquad \lim_{t \to -\infty} \nabla_N T = 0 , $$ (A.2.24)

for any past-pointing time-like $N$ (light-like for $m = 0$). Since in practical calculations one
may have other differential operators acting on $T$ before $L^{-1}$, imposing the above condition
will not suffice in general, so we will need to be more conservative. If $\mathcal{C}_x$ denotes the interior of
the past light-cone of $x$, then we will demand that $\mathrm{supp}(T) \cap \mathcal{C}_x$ is compact for all $x$ and will
refer to such tensors as tensors with “finite past”.


A.3 Green’s bi-tensor properties 

A.3.1 Converse of (A.2.12) 

Here we show that (A.2.12) holds also when one acts on the point y instead of x 

TV 1 ...<7m\Vl-Vn VV7 ]( y \ " 'Kn I pi.. .p n ( \ _ A Ml ■■ -Mm I VI ...V n f \ 

Indeed, acting with $L[\nabla_R](y)$ on (A.2.12) and using (A.2.9) we get 


T v l- v n |Ai...A„ 
•Vl-A4>l- K n 




/l’l”*/^m ' Pl-’-pn 
= T V i- u n |A 1 ...A„ 


[v R ](!/)AS:il T:r(x, y ) 


Thus, the convolution with a test tensor on $\mathcal{M}_L$, using $[\nabla_L, \nabla_R] = 0$, gives 


d D x [v L ](x) 


(A.3.1) 


(A.3.2) 


rh-A |Ai...A„ 


[Vr] (y) G%rX \ 6 “*r (X, y) Tj*;;;£ {x 
(A.3.2) 

i.b.p 


= ■ [ a D * GTw A£:£ C'r"r (*, y)L^jt:i: [wim K\:: : w 

= (mt± (»). 


(A.3.3) 


where in the second step we have integrated by parts and the boundary terms have dropped
because of the Dirac delta. Comparing the first line with the last we get that the term in
square brackets obeys the distributional definition of $\Delta$, i.e. (A.3.1).


A.3.2 Conditions for also being a left-inverse 

In general $L^{-1}$ is not a left-inverse, $L^{-1} L \neq {\rm id}$, as is most obvious when acting on $h \in$ Ker[L]

$$ L^{-1} L h = 0 . $$ (A.3.4)

The most general statement is rather

$$ (L^{-1} L - {\rm id})\, T \in {\rm Ker}[L] , \qquad \forall\, T \in \Gamma(T^n_m\mathcal{M}) , $$ (A.3.5)

since applying $L$ from the left will give zero. Note that in general the resulting element of
Ker[L] will depend on $g$, because $L$ does, and is obviously also $\mathbb{R}$-linear in $T$. Indeed, because
of the very existence of non-zero elements in Ker[L], left-inverses generically do not exist. To
understand this intuitively consider for instance the operator $\partial^2$ in one dimension and the
following acausal Green’s function

$$ G(t, t') = \theta(t - t')\, \theta(t' - t_0)\, (t - t') - \theta(t_0 - t')\, \theta(t' - t)\, (t - t') , $$ (A.3.6)

so that the inverse operation is

$$ (\partial^{-2} f)(t) \equiv \int_{-\infty}^{\infty} dt'\, G(t, t')\, f(t') = \int_{t_0}^{t} dt'\, (t - t')\, f(t') , $$ (A.3.7)

and we get

$$ (\partial^{-2} \partial^2 f)(t) - f(t) = -f(t_0) - f'(t_0)\, (t - t_0) \in {\rm Ker}[\partial^2] . $$ (A.3.8)
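
The statement (A.3.8) can be verified numerically. The following is only a sketch under illustrative assumptions that are not in the text: the test function is $f = \sin$, the reference time is $t_0 = 0.3$, the convolution (A.3.7) is evaluated with Simpson's rule, and all names are hypothetical.

```python
import math

def inverse_d2(f, t, t0, n=2000):
    """(A.3.7): integral from t0 to t of (t - s) f(s) ds, by composite Simpson (n even)."""
    h = (t - t0) / n
    total = 0.0
    for k in range(n + 1):
        s = t0 + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * (t - s) * f(s)
    return total * h / 3.0

def f(s):
    return math.sin(s)

def df(s):
    return math.cos(s)

def d2f(s):
    return -math.sin(s)

t0, t = 0.3, 1.7

# Left action: the inverse applied after the operator leaves a kernel element behind.
left_defect = inverse_d2(d2f, t, t0) - f(t)
kernel_element = -f(t0) - df(t0) * (t - t0)

# Right action: the operator applied after the inverse gives back f (central differences).
step = 1e-3
second_derivative = (inverse_d2(f, t + step, t0) - 2.0 * inverse_d2(f, t, t0)
                     + inverse_d2(f, t - step, t0)) / step ** 2
```

Here `left_defect` matches `kernel_element`, a function linear in $t$ and hence annihilated by $\partial^2$, while `second_derivative` reproduces $f(t)$: the operator is a right-inverse for any $f$, but a left-inverse only up to an element of the kernel fixed by the data at $t_0$.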

It is clear that with this definition, $\partial^{-2}$ is a left inverse $\partial^{-2} \partial^2 = {\rm id}$ only on the subspace of
functions obeying $f(t_0) = f'(t_0) = 0$. Moreover, we see that the resulting element of the kernel
is determined by the boundaries of the convolution, i.e. the support of the Green’s function
with respect to the second argument. In the retarded case where $t_0 \to -\infty$ the integral makes
sense only for functions that decrease sufficiently fast at infinity, i.e.

$$ \lim_{t \to -\infty} f(t) = 0 , \qquad \lim_{t \to -\infty} f'(t) = 0 , $$ (A.3.9)

and then $\partial^{-2}$ is a left-inverse. This is actually the case in any dimension and on arbitrary
geometries, i.e. the obstruction to being a left-inverse is generated by non-trivial boundaries
of the support of $G$. Now that we have understood this using the simplest example, let us
consider the case $L = \Box$ which is the one of interest in this thesis, for arbitrary dimension and
for globally hyperbolic $(\mathcal{M}, g)$ so that the past light-cones extend to past infinity. We have




J M 

[ d d y G r ;\;;Z\rl::Z (*, y) ( y ) 

JIA 

J M 


lu 


d d y G^;;Z !“(*> y)^l::Ziv) 




)u 


J M 


(A3.1) 


/W 


d J !/ v / =9fe) "SiW + . 


Ml 


(A.3.10) 


where $N$ is the normal vector to $\mathcal{U}$ and 


rr 


( y) = G 


— 1^1...Z^n I CTi ...dm 

I Pi •••pn 


(x,y)V N T^(y), 


(A.3.11) 


is the Wronskian of $G(x,y)$ and $T(y)$ with respect to the derivative operator $\nabla_N$ acting on $y$. 
The question now is: what is $\partial\mathcal U$? If the integrand we started with were smooth, then by Stokes' 
theorem $\partial\mathcal U$ would be the boundary of the support of the integrand. However, since 
$G_r(x,y)$ is non-zero only when $y$ is on the past light-cone of $x$, it is actually a 
distribution, just like in the flat space-time case (A.2.22). Thus, the integration by parts has 
to be understood in the way it is used for distributions: the boundary term is supported on the 
boundary of the support of the distribution. Since in our case the integrand is supported on 
the past light-cone $\mathcal C_x$ of $x$, the integral of the Wronskian is actually supported on $\partial\mathcal C_x$, which 
lies at past (null) infinity. Thus, the conditions that one must impose on $T$ for 
$\Box_r^{-1}$ to be a left inverse are (A.2.24), i.e. precisely the ones for which $\Box_r^{-1}$ is defined anyway. 
We thus have that $\Box_r^{-1}$ is also a left inverse on the domain of $\Gamma(T^m_n\mathcal M)$ where it is defined. 

It is quite interesting to see how this computation goes through in the massive case $L = 
\Box - m^2$, since then the support of the Green's function is inside the past light-cone, so that 
$\partial\mathcal U = \mathcal C_x$. For simplicity let us work on flat space-time, since in that case we have an explicit 
result (A.2.22). We then see that we have the singular part of $\Box_r^{-1}$, which is treated as before 
and thus gives an integral supported at past infinity. The smooth part, which is supported on 
the inside of the cone, however has a non-zero limit $|x| \to 0$ from the inside of the cone 


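$$\lim_{|x|\to 0^+} G_r(x) = -\frac{1}{4\pi}\left[\frac{\delta(x^0 - |\mathbf x|)}{|\mathbf x|} - \frac{m^2}{2}\,\theta(x^0 - |\mathbf x|)\right], \tag{A.3.12}$$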


so the corresponding Wronskian boundary term is not zero and lies on $\mathcal C_x$, not on $\partial\mathcal C_x$. Therefore, 
if we wanted this to be zero for all $x$ then we would need to impose $T = 0$. However, 
what we see is that the smooth part of $G_r(x)$ is actually constant on the light-cone, so that 
the Wronskian (A.3.11) is a total derivative. Thus, by Stokes' theorem it also amounts to an 
integral that is supported on $\partial\mathcal C_x$ at past infinity. For generic space-times we would need to 
know the limiting behaviour of the Green's function on the light-cone to answer the question 
of left-inversion. 


A.3.3 Commutation relations of $L^{-1}$ 

We are now interested in understanding the commutator $[M, L^{-1}]$, where $M[\nabla]$ is some 
differential operator. To do so we can simply act with the derivation $[M, \cdot\,]$ on the equation 
$L L^{-1} = \mathrm{id}$ to get 

$$[M, L]\, L^{-1} + L\, [M, L^{-1}] = 0. \tag{A.3.13}$$
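As a sanity check of the relation above, here is a minimal finite-dimensional sketch in which $L$ and $M$ are replaced by 2×2 matrices. This is only an analogue and an assumption of the example: an invertible matrix has a trivial kernel, so its inverse is two-sided and the commutator can be isolated without any kernel ambiguity, which is not the generic situation for differential operators. The matrices and helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    """[a, b] = a b - b a."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def inverse(a):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

# Illustrative matrices; exact rational arithmetic avoids rounding issues.
L = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
M = [[Fraction(0), Fraction(1)], [Fraction(4), Fraction(5)]]
L_inv = inverse(L)
ZERO = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]

# Matrix analogue of (A.3.13): [M, L] L^-1 + L [M, L^-1] = 0.
lhs = add(matmul(commutator(M, L), L_inv), matmul(L, commutator(M, L_inv)))
assert lhs == ZERO

# With a trivial kernel, multiplying by L^-1 from the left isolates [M, L^-1].
isolated = scale(-1, matmul(L_inv, matmul(commutator(M, L), L_inv)))
assert commutator(M, L_inv) == isolated
```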

Isolating $[M, L^{-1}]$ would require the use of a left-inverse which, as we have seen in the previous 
section, does not exist when acting on generic functions. We can make use of the weaker 
equation (A.3.5) to get the most conservative statement 

$$[M, L^{-1}]\, T = -L^{-1}\, [M, L]\, L^{-1} T + X, \qquad X \in \mathrm{Ker}[L], \tag{A.3.14}$$

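where $X$ is $\mathbb R$-linear in $[M, L]\, L^{-1} T$. For instance, in the case $L = \Box$ and $M = \nabla_\mu$, we get the 
following rule for the retarded inverse on a scalar field $\phi$ of finite past 

$$[\nabla_\mu, \Box_r^{-1}]\, \phi = \Box_r^{-1}\left(R_\mu{}^\nu\, \nabla_\nu \Box_r^{-1} \phi\right), \tag{A.3.15}$$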

i.e. there is no $X$ part precisely because then $\Box_r^{-1}$ is also a left inverse. Isolating $\Box_r^{-1}\nabla_\mu$ and 
restricting to an Einstein space-time $R_{\mu\nu} = k\, g_{\mu\nu}$, where $k$ is constant, we get 

$$\Box_r^{-1} \nabla_\mu = \left(1 - k\,\Box_r^{-1}\right) \nabla_\mu \Box_r^{-1}. \tag{A.3.16}$$

Inverting the operator in the bracket in a causal way, we get 

$$\nabla_\mu \Box_r^{-1} = (\Box - k)_r^{-1}\, \nabla_\mu. \tag{A.3.17}$$
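The intermediate steps behind (A.3.15) to (A.3.17) can be spelled out as follows. This is a sketch under an assumed curvature convention, stated in the comments, which is not fixed in this section; with the opposite convention the intermediate signs change accordingly.

```latex
% Assumed conventions (not stated in this section):
%   [\nabla_\mu, \nabla_\nu] V^\rho = R^\rho{}_{\sigma\mu\nu} V^\sigma ,
%   R_{\sigma\nu} \equiv R^\mu{}_{\sigma\mu\nu} ,  no torsion.
\begin{align*}
[\nabla_\mu, \Box]\,\phi
  &= \nabla_\mu \nabla_\nu \nabla^\nu \phi - \nabla_\nu \nabla^\nu \nabla_\mu \phi \\
  &= \nabla_\mu \nabla_\nu \nabla^\nu \phi - \nabla_\nu \nabla_\mu \nabla^\nu \phi
     && \text{(since } \nabla^\nu \nabla_\mu \phi = \nabla_\mu \nabla^\nu \phi \text{ on a scalar)} \\
  &= [\nabla_\mu, \nabla_\nu]\, \nabla^\nu \phi
   = R^\nu{}_{\sigma\mu\nu}\, \nabla^\sigma \phi
   = - R_\mu{}^\nu\, \nabla_\nu \phi , \\[4pt]
[\nabla_\mu, \Box_r^{-1}]\,\phi
  &= - \Box_r^{-1}\, [\nabla_\mu, \Box]\, \Box_r^{-1} \phi
   = \Box_r^{-1} \left( R_\mu{}^\nu\, \nabla_\nu \Box_r^{-1} \phi \right)
     && \text{(A.3.14) with } X = 0 , \\[4pt]
\nabla_\mu \Box_r^{-1} - \Box_r^{-1} \nabla_\mu
  &= k\, \Box_r^{-1} \nabla_\mu \Box_r^{-1}
     && \text{(for } R_{\mu\nu} = k\, g_{\mu\nu} \text{)} \\
\Rightarrow \quad
\nabla_\mu \Box_r^{-1}
  &= \left( 1 - k\, \Box_r^{-1} \right)^{-1} \Box_r^{-1}\, \nabla_\mu
   = \left[ \Box \left( 1 - k\, \Box_r^{-1} \right) \right]^{-1} \nabla_\mu
   = (\Box - k)_r^{-1}\, \nabla_\mu .
\end{align*}
```

Note that the two factors of $\Box_r^{-1}$ in the second line act on different spaces: the inner one on the scalar $\phi$, the outer one on a vector field.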


A.3.4 Displacing the indices of Green’s bi-tensors 

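Since we only use metric-compatible covariant derivatives, $[\nabla, g] = 0$, we have that $[g, L] = 0$ for 
any differential operator $L$. At the level of the Green's bi-tensors, the isomorphism 
$g$ between $\Gamma(T^m_n\mathcal M)$ and $\Gamma(T^{m+1}_{n-1}\mathcal M)$ 

$$T^{\mu_1\dots\mu_{m+1}}_{\nu_1\dots\nu_{n-1}} \equiv g^{\mu_{m+1}\nu_n}\, T^{\mu_1\dots\mu_m}_{\nu_1\dots\nu_n}, \tag{A.3.18}$$

induces an isomorphism between the Green's functions of $L$ in $\Gamma(T^m_n\mathcal M)$ and the ones in 
$\Gamma(T^{m+1}_{n-1}\mathcal M)$, which is found through 

$$
\begin{aligned}
(G \cdot T)^{\mu_1\dots\mu_{m+1}}_{\nu_1\dots\nu_{n-1}}(x)
&\equiv \int d^D y\,\sqrt{-g(y)}\; G^{\mu_1\dots\mu_{m+1}|\rho_1\dots\rho_{n-1}}_{\nu_1\dots\nu_{n-1}|\sigma_1\dots\sigma_{m+1}}(x,y)\, T^{\sigma_1\dots\sigma_{m+1}}_{\rho_1\dots\rho_{n-1}}(y) \\
&= \int d^D y\,\sqrt{-g(y)}\; G^{\mu_1\dots\mu_{m+1}|\rho_1\dots\rho_{n-1}}_{\nu_1\dots\nu_{n-1}|\sigma_1\dots\sigma_{m+1}}(x,y)\, g^{\sigma_{m+1}\rho_n}(y)\, T^{\sigma_1\dots\sigma_m}_{\rho_1\dots\rho_n}(y) \\
&= g^{\mu_{m+1}\nu_n}(x) \int d^D y\,\sqrt{-g(y)}\; G^{\mu_1\dots\mu_m|\rho_1\dots\rho_n}_{\nu_1\dots\nu_n|\sigma_1\dots\sigma_m}(x,y)\, T^{\sigma_1\dots\sigma_m}_{\rho_1\dots\rho_n}(y),
\end{aligned}
\tag{A.3.19}
$$

so that 

$$G^{\mu_1\dots\mu_{m+1}|\rho_1\dots\rho_{n-1}}_{\nu_1\dots\nu_{n-1}|\sigma_1\dots\sigma_{m+1}}(x,y) = g^{\mu_{m+1}\nu_n}(x)\; G^{\mu_1\dots\mu_m|\rho_1\dots\rho_n}_{\nu_1\dots\nu_n|\sigma_1\dots\sigma_m}(x,y)\; g_{\rho_n\sigma_{m+1}}(y). \tag{A.3.20}$$

Indeed, the latter trivially obeys $L[\nabla]\, G = \Delta$ since $[g, L] = 0$. 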